Abstract: In this paper, we first discuss the definition of modularity (Q) used as a metric for community quality and then review the
modularity maximization approaches used for community detection in the last decade. Then, we discuss two opposite yet
coexisting problems of modularity optimization: in some cases, it tends to favor small communities over large ones, while in others it favors large
communities over small ones (the so-called resolution limit problem). Next, we overview several community quality metrics proposed to
solve the resolution limit problem and discuss Modularity Density (Qds), which simultaneously avoids the two problems of modularity.
Finally, we introduce two novel fine-tuned community detection algorithms that iteratively attempt to improve the community quality
measurements by splitting and merging the given network community structure. The first of them, referred to as Fine-tuned Q, is based
on modularity (Q), while the second one is based on Modularity Density (Qds) and denoted as Fine-tuned Qds. Then, we compare the
greedy algorithm of modularity maximization (denoted as Greedy Q), Fine-tuned Q, and Fine-tuned Qds on four real networks, and
also on the classical clique network and the LFR benchmark networks, each of which is instantiated by a wide range of parameters. The
results indicate that Fine-tuned Qds is the most effective among the three algorithms discussed. Moreover, we show that Fine-tuned
Qds can be applied to the communities detected by other algorithms to significantly improve their results.

Index Terms: Community Detection, Modularity, Maximization, Fine-tuned.

1 Introduction

Many networks, including the Internet, citation networks, transportation networks, email networks,
and social and biochemical networks, display community structure, which identifies groups of nodes within
which connections are denser than between them [1]. Detecting and characterizing such community structure,
which is known as community detection, is one of the fundamental issues in the study of network systems.
Community detection has been shown to reveal latent yet meaningful structure in networks, such as groups
in online and contact-based social networks, functional modules in protein-protein interaction networks,
groups of customers with similar interests in online retailer user networks, and groups of scientists in
interdisciplinary collaboration networks [2].

In the last decade, the most popular community detection methods have been those that maximize the quality
metric known as modularity [1], [3]-[5] over all possible partitions of a network. Such modularity optimization
algorithms include greedy algorithms [6]-[9], spectral methods [3], [10]-[15], extremal optimization [16],
simulated annealing [17]-[20], sampling techniques [21], and mathematical programming [22]. Modularity measures
the difference between the actual fraction of edges within the community and the fraction expected in a
randomized graph with the same number of nodes and the same degree sequence. It is widely used as a measurement
of the strength of the community structures detected by community detection algorithms. However, modularity
maximization has two opposite yet coexisting problems. In some cases, it tends to split large communities into
two or more small communities [23], [24]. In other cases, it tends to form large communities by merging
communities that are smaller than a certain threshold, which depends on the total number of edges in the
network and on the degree of inter-connectivity between the communities. The latter problem is also known as the
resolution limit problem [23]-[25].

To solve these two issues of modularity, several community quality metrics were introduced, including
Modularity Density (Qds) [23], [24], which simultaneously avoids both of them. We then propose two novel
fine-tuned community detection algorithms that repeatedly attempt to improve the quality measurements by
splitting and merging the given community structure. We denote the corresponding algorithm based on
modularity (Q) as Fine-tuned Q, while the one based on Modularity Density (Qds) is referred to as Fine-tuned Qds.
Finally, we evaluate the greedy algorithm of modularity maximization (denoted as Greedy Q), Fine-tuned Q, and
Fine-tuned Qds by using seven community quality metrics based on ground truth communities. These evaluations
are conducted on four real networks, and also on the classical clique network and the LFR benchmark networks,
each of which is instantiated by a wide range of parameters. The results indicate that Fine-tuned Qds
is the most effective method and can also dramatically improve the community detection results of other
algorithms. Further, all seven quality measurements based on ground truth communities are consistent with Qds,
but not consistent with Q, which implies the superiority of Modularity Density over the original modularity.

2 Review of Modularity Related Literature

In this section, we first review the definition of modularity and the corresponding optimization approaches.
Then, we discuss the two opposite yet coexisting problems of modularity maximization. Finally, we overview
several community quality measurements proposed to solve the resolution limit problem and then discuss
Modularity Density (Qds) [23], [24], which simultaneously avoids these two problems.

2.1 Definition of Modularity

Comparing results of different network partitioning algorithms can be challenging, especially when network
structure is not known beforehand. The concept of modularity defined in [1] provides a measure of the quality
of a particular partitioning of a network. Modularity (Q) quantifies the community strength by comparing
the fraction of edges within the community with the corresponding fraction when random connections between
the nodes are made. The justification is that the nodes of a community should have more links among themselves
than a random group of nodes would. Thus, a Q value close to 0 means that the fraction of edges inside
communities is no better than in the random case, and a value of 1 means that the network community structure
has the highest possible strength.

Formally, modularity (Q) can be defined as [1]:

$$Q = \sum_{c_i \in C} \left[ \frac{|E_{c_i}^{in}|}{|E|} - \left( \frac{2|E_{c_i}^{in}| + |E_{c_i}^{out}|}{2|E|} \right)^2 \right], \tag{1}$$

where $C$ is the set of all the communities, $c_i$ is a specific community in $C$, $|E_{c_i}^{in}|$ is the number
of edges between nodes within community $c_i$, $|E_{c_i}^{out}|$ is the number of edges from the nodes in
community $c_i$ to the nodes outside $c_i$, and $|E|$ is the total number of edges in the network.
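
As a minimal sketch of the per-community definition above, the following Python function sums, over all communities, the fraction of edges inside the community minus the squared fraction of the total degree that the community holds. The function name, the edge-list input format, and the toy graph (two triangles joined by one edge) are illustrative assumptions, not part of the original paper.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once (assumed format).
    community_of: dict mapping node -> community label.
    """
    m = len(edges)                       # |E|, total number of edges
    e_in = defaultdict(int)              # edges with both ends inside a community
    e_out = defaultdict(int)             # edges leaving a community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            e_in[cu] += 1
        else:
            e_out[cu] += 1
            e_out[cv] += 1
    q = 0.0
    for c in set(community_of.values()):
        total_degree = 2 * e_in[c] + e_out[c]          # sum of degrees in c
        q += e_in[c] / m - (total_degree / (2 * m)) ** 2
    return q

# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "all" for n in range(6)}
print(modularity(EDGES, two_groups))     # one community per triangle
print(modularity(EDGES, one_group))      # everything in a single community
```

On this toy graph, placing each triangle in its own community gives Q = 5/14, while the single-community partition gives Q = 0, which illustrates why Q rewards partitions with dense internal connectivity.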

Modularity can also be expressed in the following form [3]:

$$Q = \frac{1}{2|E|} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2|E|} \right] \delta_{c_i, c_j}, \tag{2}$$

where $k_i$ is the degree of node $i$, $A_{ij}$ is an element of the adjacency matrix, $\delta_{c_i, c_j}$ is
the Kronecker delta symbol, and $c_i$ is the label of the community to which node $i$ is assigned.

Since larger Q means a stronger community structure, several algorithms, which we will discuss in the next
section, are based on modularity optimization.

The modularity measure defined above is suitable only for undirected and unweighted networks. However, this
definition can be naturally extended to apply to directed networks as well as to weighted networks. Weighted
and directed networks contain more information than undirected and unweighted ones and are therefore often
viewed as more valuable but also as more difficult to analyze than their simpler counterparts.

The revised definition of modularity that works for directed networks is as follows [4]:

$$Q = \frac{1}{|E|} \sum_{ij} \left[ A_{ij} - \frac{k_i^{in} k_j^{out}}{|E|} \right] \delta_{c_i, c_j}, \tag{3}$$

where $k_i^{in}$ and $k_j^{out}$ are the in- and out-degrees.

Although many networks can be regarded as binary, i.e., as either having an edge between a pair of nodes or
not having it, there are many other networks for which it would be natural to treat edges as having a certain
degree of strength or weight.

The same general techniques that have been developed for unweighted networks are applied to their
weighted counterparts in [5] by mapping weighted networks onto multigraphs. For non-negative integer
weights, an edge with weight $w$ in a weighted graph corresponds to $w$ parallel edges in a corresponding
multigraph. Although negative weights can arise in some applications, they are rarely useful in social
networks, so for the sake of brevity we will not discuss them here. It turns out that an adjacency matrix of
a weighted graph is equivalent to that of a multigraph with unweighted edges. Since the structure of the
adjacency matrix is independent of the edge weights, it is possible to adjust all the methods developed for
unweighted networks to the weighted ones.

It is necessary to point out that the notion of the degree of a node should also be extended for weighted
graphs. In this case, the degree of a node is defined as the sum of the weights of all edges incident to this node.

It is shown in [5] that the same definitions of modularity that were given above hold for weighted networks
as well if we treat $A_{ij}$ as the value that represents the weight of the connection and set
$|E| = \frac{1}{2} \sum_{ij} A_{ij}$.
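
The weighted extension can be sketched in Python as follows: edge weights replace edge counts, the degree of a node becomes the sum of its incident weights, and the total edge count becomes the total weight. The function name and the dictionary-based input format are assumptions made for illustration only.

```python
def weighted_modularity(weights, community_of):
    """Modularity of an undirected weighted graph.

    weights: dict {(u, v): w} with u != v, each undirected edge listed once
             (assumed format); w is a non-negative weight.
    community_of: dict mapping node -> community label.
    """
    total = sum(weights.values())        # plays the role of |E| = (1/2) sum_ij A_ij
    inside = {}                          # total weight inside each community
    strength = {}                        # sum of weighted degrees per community
    for (u, v), w in weights.items():
        cu, cv = community_of[u], community_of[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            inside[cu] = inside.get(cu, 0.0) + w
    return sum(inside.get(c, 0.0) / total - (s / (2 * total)) ** 2
               for c, s in strength.items())

# Hypothetical example: two triangles with heavy internal edges and a light bridge.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
unit = {e: 1.0 for e in EDGES}
heavy = {e: 2.0 for e in EDGES}
heavy[(2, 3)] = 1.0
print(weighted_modularity(unit, groups), weighted_modularity(heavy, groups))
```

With unit weights the function reduces to the unweighted value (5/14 on this toy graph); strengthening the internal edges relative to the bridge raises the value to 11/26.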


2.2 Modularity Optimization Approaches 

In the literature, a high value of modularity (Q) indicates a good community structure, and the partition
corresponding to the maximum value of modularity on a given graph is supposed to have the highest quality,
or at least a very good one. Therefore, it is natural to discover communities by maximizing modularity over
all possible partitions of a network. However, it is computationally prohibitive to exhaustively
search all such partitions for the optimal value of modularity, since modularity optimization is known to be
NP-hard [26]. Nevertheless, many heuristic methods were introduced to find high-modularity partitions in a
reasonable time. Those approaches include greedy algorithms [6]-[9], spectral methods [3], [10]-[15], extremal
optimization [16], simulated annealing [17]-[20], sampling techniques [21], and mathematical programming [22].
In this section, we will review those modularity optimization heuristics.

2.2.1 Greedy Algorithms

The first greedy algorithm was proposed by Newman [6]. It is an agglomerative hierarchical clustering method.
Initially, every node belongs to its own community, creating altogether $|V|$ communities. Then, at each step,
the algorithm considers merging pairs of communities and chooses the merger for which the resulting
modularity is the largest. The change in Q upon joining two communities $c_i$ and $c_j$ is

$$\Delta Q_{c_i, c_j} = 2 \left( \frac{|E_{c_i, c_j}|}{2|E|} - \frac{|E_{c_i}||E_{c_j}|}{4|E|^2} \right), \tag{4}$$

where $|E_{c_i, c_j}|$ is the number of edges from community $c_i$ to community $c_j$ and
$|E_{c_i}| = 2|E_{c_i}^{in}| + |E_{c_i}^{out}|$ is the total degree of the nodes in community $c_i$.
$\Delta Q_{c_i, c_j}$ can be calculated in constant time. The algorithm stops when all the nodes in the network
are in a single community after $(|V| - 1)$ steps of merging. Then, there are $|V|$ partitions in total, the
first one defined by the initial step and each subsequent one resulting from each of the subsequent
$(|V| - 1)$ merging steps. The partition with the largest value of modularity, approximating the
modularity maximum best, is the result of the algorithm. At each merging step, the algorithm needs to compute
the change $\Delta Q_{c_i, c_j}$ of modularity resulting from joining any two currently existing communities
$c_i$ and $c_j$ in order to choose the best merger. Since merging two disconnected communities will not
increase the value of modularity, the algorithm checks only the merging of connected pairs of communities, and
the number of such pairs is at most $|E|$, limiting the complexity of this part to $O(|E|)$. However, the rows
and columns of the adjacency matrix corresponding to the two merged communities must be updated, which takes
$O(|V|)$. Since there are $(|V| - 1)$ iterations, the final complexity of the algorithm
is $O((|E| + |V|)|V|)$, or $O(|V|^2)$ for sparse networks.
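
The merging procedure described above can be sketched in Python as follows. This is a simple illustration under stated assumptions, not Newman's implementation: it stores inter-community edge counts in nested dictionaries instead of an adjacency matrix, rescans all connected pairs at every step, and stops early if the remaining communities are disconnected. The names `greedy_modularity`, `members`, `degree_sum`, and `links` are hypothetical.

```python
def greedy_modularity(nodes, edges):
    """Agglomerative greedy modularity maximization (simple, unoptimized sketch).

    nodes: iterable of comparable node ids; edges: list of (u, v) with u != v,
    each undirected edge listed once. Returns (best_q, best_partition).
    """
    m = len(edges)
    members = {n: {n} for n in nodes}        # community id -> set of nodes
    degree_sum = {n: 0 for n in members}     # total degree of each community
    links = {n: {} for n in members}         # links[c][d]: edges between c and d
    for u, v in edges:
        degree_sum[u] += 1
        degree_sum[v] += 1
        links[u][v] = links[u].get(v, 0) + 1
        links[v][u] = links[v].get(u, 0) + 1
    # Singleton partition: no internal edges, so only the degree term remains.
    q = -sum((d / (2 * m)) ** 2 for d in degree_sum.values())
    best_q, best = q, [set(s) for s in members.values()]
    while len(members) > 1:
        # Score every connected pair once (c < d) with the change in Q on merging.
        candidates = [
            (2 * (w / (2 * m) - degree_sum[c] * degree_sum[d] / (4 * m * m)), c, d)
            for c in links for d, w in links[c].items() if c < d
        ]
        if not candidates:                   # remaining communities are disconnected
            break
        dq, c, d = max(candidates)
        members[c] |= members.pop(d)         # merge community d into community c
        degree_sum[c] += degree_sum.pop(d)
        for x, w in links.pop(d).items():
            del links[x][d]
            if x != c:
                links[c][x] = links[c].get(x, 0) + w
                links[x][c] = links[c][x]
        q += dq
        if q > best_q:                       # remember the best partition seen so far
            best_q, best = q, [set(s) for s in members.values()]
    return best_q, best

# Hypothetical toy graph: two triangles joined by the edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
best_q, best = greedy_modularity(range(6), EDGES)
print(best_q, best)
```

On the toy graph the sequence of mergers passes through the partition with one community per triangle, which has the largest modularity among all partitions visited and is therefore returned.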

Although Newman's algorithm [6] is much faster than the algorithm of Newman and Girvan [1], whose
complexity is $O(|E|^2|V|)$, Clauset et al. [7] pointed out that the update of the adjacency matrix at each step
contains a large number of unnecessary operations when the network is sparse and therefore its matrix has a
lot of zero entries. They introduced data structures for sparse matrices to perform the updating operation more
efficiently. In their algorithm, instead of maintaining the adjacency matrix and computing
$\Delta Q_{c_i, c_j}$, they maintained and updated the matrix with entries being $\Delta Q_{c_i, c_j}$ for
the pairs of connected communities $c_i$ and $c_j$. The authors introduced three data structures to represent
sparse matrices efficiently: (1) each row of the matrix is stored as a balanced binary tree in order to search
and insert elements in $O(\log|V|)$ time and also as a max-heap so as to locate the largest element of each row
in constant time; (2) another max-heap stores the largest element of each row of the matrix so as to locate the
largest $\Delta Q_{c_i, c_j}$ in constant time; (3) a vector is used to save $|E_{c_i}|$ for each
community $c_i$. Then, in each step, the largest $\Delta Q_{c_i, c_j}$ can be found in constant time and the
update of the matrix after merging two communities $c_i$ and $c_j$ takes
$O((k_{c_i} + k_{c_j})\log|V|)$, where $k_{c_i}$ and $k_{c_j}$ are the numbers of neighboring communities of
communities $c_i$ and $c_j$, respectively. Thus, the total running time is at most $O(\log|V|)$ times the sum
of the degrees of nodes in the communities along the dendrogram created by merging steps. This sum is in the
worst case the depth of the dendrogram times the sum of the degrees of nodes in the network. Suppose the
dendrogram has depth $d$; then the running time is $O(d|E|\log|V|)$, or $O(|V|\log^2|V|)$ when
the network is sparse and the dendrogram is almost balanced ($d \sim \log|V|$).

However, Wakita and Tsurumi [8] observed that the greedy algorithm proposed by Clauset et al. is not
scalable to networks with sizes larger than 500,000 nodes. They found that the computational inefficiency arises
from merging communities in an unbalanced manner, which yields very unbalanced dendrograms. In such
cases, the relation $d \sim \log|V|$ does not hold any more, causing the algorithm to run at its worst-case
complexity. To balance the merging of communities, the authors introduced three types of consolidation ratios
to measure the balance of the community pairs and used them with modularity to perform the joining process of
communities without bias. This modification enables the algorithm to scale to networks with sizes up to
10,000,000 nodes. It also approximates the modularity maximum better than the original algorithm.

Another type of greedy modularity optimization algorithm, different from those above, was proposed by
Blondel et al., and it is usually referred to as Louvain [9]. It is divided into two phases that are repeated
iteratively. Initially, every node belongs to its own community, so there are $|V|$ communities. In the first
phase, every node, in a certain order, is considered for merging into its neighboring communities and the
merger with the largest positive gain is selected. If all possible gains associated with the merging of this
node are negative, then it stays in its original community. This merging procedure repeats iteratively and
stops when no increase of Q can be achieved.

After the first phase, Louvain reaches a local maximum of Q. Then, the second phase of Louvain builds a
community network based on the communities discovered in the first phase. The nodes in the new network are
the communities from the first phase, and there is an edge between two new nodes if there are edges between
nodes in the corresponding two communities. The weights of those edges are the sums of the weights of the edges
between nodes in the corresponding two communities. The edges between nodes of the same community of the
first phase result in a self-loop for this community node in the new network. After the community network is
generated, the algorithm applies the first phase again on this new network. The two phases repeat iteratively
and stop when there is no more change and consequently a maximum modularity is obtained. The number of
iterations of this algorithm is usually very small and most of the computational time is spent in the first
iteration. Thus, the complexity of the algorithm grows like $O(|E|)$. Consequently, it is scalable to large
networks with the number of nodes up to a billion. However, the results of Louvain are impacted by the order
in which the nodes in the first phase are considered for merging [27].
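
The first (local-moving) phase can be illustrated with the short Python sketch below. It is a simplified version under stated assumptions: the graph is undirected and unweighted with at least one edge, nodes are visited in dictionary order, and the gain of inserting an isolated node $i$ into community $c$ is computed as $k_{i,c}/|E| - \Sigma_c k_i / (2|E|^2)$, where $k_{i,c}$ counts edges from $i$ to $c$ and $\Sigma_c$ is the total degree of $c$. The second phase (building the community network) is omitted, and all names are hypothetical.

```python
def louvain_first_phase(adj):
    """Local-moving phase on an undirected, unweighted graph (sketch).

    adj: dict mapping node -> set of neighbours (no self-loops, at least one edge).
    Returns a dict mapping node -> community label.
    """
    m = sum(len(nbrs) for nbrs in adj.values()) / 2    # |E|
    community = {n: n for n in adj}                    # every node starts alone
    total_degree = {n: len(adj[n]) for n in adj}       # degree sum per community
    moved = True
    while moved:
        moved = False
        for i in adj:                                  # fixed visiting order
            k_i = len(adj[i])
            old = community[i]
            total_degree[old] -= k_i                   # temporarily take i out
            links = {old: 0}                           # edges from i to nearby communities
            for j in sorted(adj[i]):
                c = community[j]
                links[c] = links.get(c, 0) + 1

            def gain(c):
                # Modularity gain of inserting the isolated node i into c.
                return links[c] / m - total_degree[c] * k_i / (2 * m * m)

            best, best_gain = old, gain(old)
            for c in links:
                g = gain(c)
                if g > best_gain + 1e-12:              # accept strict improvements only
                    best, best_gain = c, g
            total_degree[best] += k_i
            community[i] = best
            if best != old:
                moved = True
    return community

# Hypothetical toy graph: two triangles joined by the edge (2, 3).
ADJ = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
labels = louvain_first_phase(ADJ)
print(labels)
```

Changing the visiting order in the `for i in adj` loop can change intermediate moves, which reflects the order dependence noted in the text.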


2.2.2 Spectral Methods 

There are two categories of spectral algorithms for maximizing modularity: one is based on the modularity
matrix [3], [10], [11]; the other is based on the Laplacian matrix of a network [12]-[14].

A. Modularity optimization using the eigenvalues 
and eigenvectors of the modularity matrix [3], [10], [11]. 

Modularity (Q) can be expressed as [3]

$$Q = \frac{1}{4|E|} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2|E|} \right) s_i s_j = \frac{1}{4|E|} s^T B s, \tag{5}$$

where $A_{ij}$ are the elements of the adjacency matrix $A$ and $s$ is the column vector representing any
division of the network into two groups. Its elements are defined as $s_i = +1$ if node $i$ belongs to the
first group and $s_i = -1$ if it belongs to the second group. $B$ is the modularity matrix with elements

$$B_{ij} = A_{ij} - \frac{k_i k_j}{2|E|}. \tag{6}$$

Representing $s$ as a linear combination of the normalized eigenvectors $u_i$ of $B$,
$s = \sum_{i=1}^{|V|} a_i u_i$ with $a_i = u_i^T \cdot s$, and then plugging the result into Equation (5) yields

$$Q = \frac{1}{4|E|} \sum_i a_i u_i^T B \sum_j a_j u_j = \frac{1}{4|E|} \sum_i a_i^2 \beta_i, \tag{7}$$

where $\beta_i$ is the eigenvalue of $B$ corresponding to eigenvector $u_i$. To maximize Q above, Newman [3]
proposed a spectral approach that chooses $s$ proportional to the leading eigenvector $u_1$ corresponding to
the largest (most positive) eigenvalue $\beta_1$. The choice assumes that the eigenvalues are labeled in
decreasing order $\beta_1 \geq \beta_2 \geq \cdots \geq \beta_{|V|}$. Nodes are then divided into two
communities according to the signs of the elements in $s$, with nodes corresponding to positive elements in
$s$ assigned to one group and all remaining nodes to another. Since the row and column sums of $B$ are zero,
it always has an eigenvector $(1, 1, 1, \ldots)$ with eigenvalue zero. Therefore, if $B$ has no positive
eigenvalue, then the leading eigenvector is $(1, 1, 1, \ldots)$, which means that the network is indivisible.
Moreover, Newman [3] proposed to divide a network into more than two communities by repeatedly dividing each
of the communities obtained so far into two until the additional contribution $\Delta Q$ to the modularity
made by the subdivision of a community $c$,

$$\Delta Q = \frac{1}{4|E|} s^T B^{(c)} s, \tag{8}$$

is equal to or less than 0. $B^{(c)}$ in the formula above is the generalized modularity matrix. Its elements,
indexed by the labels $i$ and $j$ of nodes within community $c$, are

$$B^{(c)}_{ij} = B_{ij} - \delta_{ij} \sum_{k \in c} B_{ik}. \tag{9}$$

Then, the same spectral method can be applied to $B^{(c)}$ to maximize $\Delta Q$. The recursive subdivision
process stops when $\Delta Q \leq 0$, which means that there is no positive eigenvalue of the matrix
$B^{(c)}$. The overall complexity of this algorithm is $O((|E| + |V|)|V|)$.
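
A single bisection step of the leading-eigenvector approach can be sketched in pure Python as follows. The sketch makes several assumptions that are not in the original text: the leading eigenvector of the modularity matrix is approximated by power iteration on a shifted matrix (the shift, taken as the largest absolute row sum, makes all eigenvalues non-negative so that the iteration converges to the most positive eigenvalue), nodes are labeled 0 to n - 1, and a small tolerance decides whether the leading eigenvalue is positive. Function and variable names are hypothetical.

```python
def leading_eigenvector_split(n, edges, iterations=500):
    """Split nodes 0..n-1 into two groups by the signs of the leading
    eigenvector of the modularity matrix; return None if indivisible."""
    degree = [0] * n
    adjacency = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        adjacency[u][v] += 1.0
        adjacency[v][u] += 1.0
        degree[u] += 1
        degree[v] += 1
    two_m = 2.0 * len(edges)
    # Modularity matrix: B_ij = A_ij - k_i k_j / (2|E|)
    b = [[adjacency[i][j] - degree[i] * degree[j] / two_m for j in range(n)]
         for i in range(n)]
    # Shift so that every eigenvalue of (B + shift * I) is non-negative.
    shift = max(sum(abs(x) for x in row) for row in b)
    vec = [float(i + 1) for i in range(n)]   # arbitrary, non-symmetric start vector
    for _ in range(iterations):
        nxt = [sum(b[i][j] * vec[j] for j in range(n)) + shift * vec[i]
               for i in range(n)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the corresponding eigenvalue of B itself.
    eigenvalue = sum(vec[i] * b[i][j] * vec[j] for i in range(n) for j in range(n))
    if eigenvalue <= 1e-9:                   # no positive eigenvalue: indivisible
        return None
    group_a = {i for i in range(n) if vec[i] > 0}
    return group_a, set(range(n)) - group_a

# Hypothetical examples: two triangles joined by one edge, and a 4-clique.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
K4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
print(leading_eigenvector_split(6, EDGES))
print(leading_eigenvector_split(4, K4))
```

The two-triangle graph is split along the bridge, while the clique is reported as indivisible because its modularity matrix has no positive eigenvalue. In practice a sparse eigensolver would replace the dense power iteration used here.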

However, the spectral algorithm described above has two drawbacks. First, it divides a network into more
than two communities by repeated division instead of getting all the communities directly in a single step.
Second, it only uses the leading eigenvector of the modularity matrix and ignores all the others, losing all
the useful information contained in those eigenvectors. Newman later proposed to divide a network into a set
of communities $C$ with $|C| \geq 2$ directly using multiple leading eigenvectors [10]. Let $S = (s_c)$ be a
$|V| \times |C|$ "community-assignment" matrix with one column for each community $c$ defined as

$$S_{i,c} = \begin{cases} 1 & \text{if node } i \text{ belongs to community } c, \\ 0 & \text{otherwise,} \end{cases} \tag{10}$$

then the modularity (Q) for this direct division of the network is given by

$$Q = \frac{1}{2|E|} \sum_{i,j=1}^{|V|} \sum_{c \in C} B_{ij} S_{i,c} S_{j,c} = \frac{1}{2|E|} \mathrm{Tr}(S^T B S), \tag{11}$$

where $\mathrm{Tr}(S^T B S)$ is the trace of matrix $S^T B S$. Defining $B = U \Sigma U^T$, where
$U = (u_1, u_2, \ldots)$ is the matrix of eigenvectors of $B$ and $\Sigma$ is the diagonal matrix of
eigenvalues $\Sigma_{ii} = \beta_i$, yields

$$Q = \frac{1}{2|E|} \sum_{i=1}^{|V|} \sum_{c \in C} \beta_i \left( u_i^T s_c \right)^2. \tag{12}$$

Then, obtaining $|C|$ communities is equivalent to selecting $|C| - 1$ independent, mutually orthogonal columns
$s_c$. Moreover, Q would be maximized by choosing the columns $s_c$ proportional to the leading eigenvectors of
$B$. However, only the eigenvectors corresponding to the positive eigenvalues will contribute positively to the
modularity. Thus, the number of positive eigenvalues, plus 1, is the upper bound of $|C|$. A more general
modularity maximization is to keep the leading $p$ ($1 \leq p \leq |V|$) eigenvectors. Q can be rewritten as

$$Q = \frac{1}{2|E|} \left( |V|\alpha + \mathrm{Tr}\left[ S^T U (\Sigma - \alpha I) U^T S \right] \right) = \frac{1}{2|E|} \left( |V|\alpha + \sum_{j=1}^{|V|} \sum_{c \in C} (\beta_j - \alpha) \left[ \sum_{i=1}^{|V|} U_{ij} S_{i,c} \right]^2 \right), \tag{13}$$

where $\alpha$ ($\alpha \leq \beta_p$) is a constant related to the approximation for Q obtained by only
adopting the first $p$ leading eigenvectors. By selecting $|V|$ node vectors $r_i$ of dimension $p$ whose
$j$th component is

$$[r_i]_j = \sqrt{\beta_j - \alpha} \, U_{ij}, \tag{14}$$

modularity can be approximated as

$$Q \simeq \frac{1}{2|E|} \left( |V|\alpha + \sum_{c \in C} |R_c|^2 \right), \tag{15}$$

where $R_c$, $c \in C$, are the community vectors

$$R_c = \sum_{i \in c} r_i. \tag{16}$$

Thus, the community detection problem is equivalent to choosing such a division of nodes into $|C|$ groups that
maximizes the magnitudes of the community vectors $R_c$ while requiring that $R_c \cdot r_i > 0$ if node $i$ is
assigned to community $c$. Problems of this type are called vector partitioning problems.

Although [10] explored using multiple leading eigenvectors of the modularity matrix, it did not pursue this
in detail beyond a two-eigenvector approach for bipartitioning [3], [10]. Richardson et al. [11] provided
an extension of these recursive bipartitioning methods by considering the best two-way or three-way division
at each recursive step to more thoroughly explore the promising partitions. To reduce the number of partitions
considered for the eigenvector-pair tripartitioning, the authors adopted a divide-and-conquer method,
which yielded an efficient approach whose computational complexity is competitive with the
two-eigenvector bipartitioning method.

B. Modularity optimization using the eigenvalues 
and eigenvectors of the Laplacian matrix [12]-[14]. 

Given a partition $C$ (a set of communities) and the corresponding "community-assignment" matrix $S = (s_c)$,
White and Smyth [12] rewrote modularity (Q) as follows:

$$Q \propto \mathrm{Tr}\left( S^T (W - D) S \right) = -\mathrm{Tr}\left( S^T L_Q S \right), \tag{17}$$

where $W = 2|E|A$ and the elements of $D$ are $D_{ij} = k_i k_j$. The matrix $L_Q = D - W$ is called the
"Q-Laplacian". Finding the "community-assignment" matrix $S$ that maximizes Q above is NP-complete, but a good
approximation can be obtained by relaxing the discreteness constraints of the elements of $S$ and allowing them
to assume real values. Then, Q becomes a continuous function of $S$ and its extremes can be found by equating
its first derivative with respect to $S$ to zero. This leads to the eigenvalue equation:

$$L_Q S = S \Lambda, \tag{18}$$

where $\Lambda$ is the diagonal matrix of Lagrangian multipliers. Thus, the modularity optimization problem is
transformed into the standard spectral graph partitioning problem. When the network is not too small, $L_Q$
can be approximated well, up to constant factors, by the transition matrix $W = D^{-1}A$ obtained by normalizing
$A$ so that all rows sum to one. $D$ here is the diagonal degree matrix of $A$. It can be shown that the
eigenvalues and eigenvectors of $W$ are precisely $1 - \lambda$ and $\mu$, where $\lambda$ and $\mu$ are the
solutions to the generalized eigenvalue problem $L\mu = \lambda D\mu$, where $L = D - A$ is the Laplacian
matrix. Thus, the underlying spectral algorithm here is equivalent to the standard spectral graph partitioning
problem, which uses the eigenvalues and eigenvectors of the Laplacian matrix.

Based on the above analysis, White and Smyth proposed two clustering algorithms, named "Algorithm
Spectral-1" and "Algorithm Spectral-2", to search for a partition $C$ with size up to $K$ predefined by an input
parameter. Both algorithms take the eigenvector matrix $U_K = (u_1, u_2, \ldots, u_{K-1})$ with the leading
$K - 1$ eigenvectors (excluding the trivial all-ones eigenvector) of the transition matrix $W$ as input.
Those $K - 1$ eigenvectors can be efficiently computed with the Implicitly Restarted Lanczos Method (IRLM) [28].
"Algorithm Spectral-1" uses the first $k - 1$ ($2 \leq k \leq K$) columns of $U_K$, denoted as $U_{k-1}$, and
clusters the row vectors of $U_{k-1}$ using k-means to find a $k$-way partition, denoted as $C_k$. Then,
the $C_{k^*}$ with size $k^*$ that achieves the largest value of Q is the final community structure.

"Algorithm Spectral-2" starts with a single community ($k = 1$) and recursively splits each community $c$ into
two smaller ones if the subdivision produces a higher value of Q. The split is done by running k-means with two
clusters on the matrix $U_{K,c}$ formed from $U_K$ by keeping only rows corresponding to nodes in $c$. The
recursive procedure stops when no more splits are possible or when $k = K$ communities have been found, and then
the final community structure with the highest value of Q is the detection result.

However, the two algorithms described above, especially "Algorithm Spectral-1", scale poorly to large
networks because of running k-means partitioning up to $K$ times. Both approaches have a worst-case complexity
of $O(K^2|V| + K|E|)$. In order to speed up the calculation while retaining effectiveness in approximating the
maximum of Q, Ruan and Zhang [13] proposed the Kcut algorithm, which recursively partitions the network to
optimize Q. At each recursive step, Kcut applies a $k$-way partition ($k = 2, 3, \ldots, l$) to the subnetwork
induced by the nodes and edges in each community using "Algorithm Spectral-1" of White and Smyth [12]. Then, it
selects the $k$ that achieves the highest Q. Empirically, Kcut with $l$ as small as 3 or 4 can significantly
improve Q over the standard bipartitioning method, and it also reduces the computational cost to
$O((|V| + |E|)\log|C|)$ for a final partition with $|C|$ communities.

Ruan and Zhang later [14] proposed the QCUT algorithm, which combines Kcut and local search to optimize Q.
The QCUT algorithm consists of two alternating stages: partitioning and refinement. In the partitioning stage,
Kcut is used to recursively partition the network until Q cannot be further improved. In the refinement stage,
a local search strategy repeatedly considers two operations. The first one is migration, which moves a node from
its current community to another one, and the second one is the merge of two communities into one. Both
are applied to improve Q as much as possible. The partitioning stage and the refinement stage alternate
until Q cannot be increased further. In order to solve the resolution limit problem of modularity, the authors
proposed HQCUT, which recursively applies QCUT to divide the subnetwork, generated with the nodes and
edges in each community, into subcommunities. Further, to avoid overpartitioning, they use a statistical test to
determine whether a community indeed has intrinsic subcommunities.

C. Equivalence of two categories of spectral algorithms for maximizing modularity [15].

Newman [15] showed that with hyperellipsoid relaxation, the spectral modularity maximization method using
the eigenvalues and eigenvectors of the modularity matrix can be formulated as the spectral algorithm that
relies on the eigenvalues and eigenvectors of the Laplacian matrix. This formulation indicates that the above
two kinds of modularity optimization approaches are equivalent. Starting with Equation (5) for the division of
a network into two groups, first the discreteness of $s_i$ is relaxed onto a hyperellipsoid with the constraint

$$\sum_i k_i s_i^2 = 2|E|. \tag{19}$$

Then, the relaxed modularity maximization problem can be easily solved by setting the first derivative of
Equation (5) with respect to $s_i$ to zero. This leads to

$$\sum_j B_{ij} s_j = \lambda k_i s_i, \tag{20}$$

or in matrix notation

$$B s = \lambda D s, \tag{21}$$

where $\lambda$ is the eigenvalue and $D$ is the diagonal matrix with elements $D_{ii} = k_i$. Plugging
Equation (20) into Equation (5) yields

$$Q = \frac{1}{4|E|} \sum_{ij} B_{ij} s_i s_j = \frac{\lambda}{4|E|} \sum_i k_i s_i^2 = \frac{\lambda}{2}. \tag{22}$$

Therefore, to achieve the highest value of Q, one should choose $\lambda$ to be the largest (most positive)
eigenvalue of Equation (21). Using Equation (6), Equation (20) can be rewritten as

$$\sum_j A_{ij} s_j = k_i \left( \lambda s_i + \frac{1}{2|E|} \sum_j k_j s_j \right), \tag{23}$$

or in matrix notation as

$$A s = D \left( \lambda s + \frac{k^T s}{2|E|} \mathbf{1} \right), \tag{24}$$

where $k$ is the vector with elements $k_i$ and $\mathbf{1} = (1, 1, 1, \ldots)$. Then, multiplying the above
equation by $\mathbf{1}^T$ results in $\lambda k^T s = 0$. If there is a nontrivial eigenvalue $\lambda > 0$,
then $k^T s = 0$ and the above equation simplifies to

$$A s = \lambda D s. \tag{25}$$

Again, $\lambda$ should be the most positive eigenvalue. However, the eigenvector corresponding to this
eigenvalue is the uniform vector $\mathbf{1}$, which fails to satisfy $k^T s = 0$. Thus, in this case, the best
one can do is to choose $\lambda$ to be the second largest eigenvalue and have $s$ proportional to the
corresponding eigenvector. In fact, this eigenvector is precisely equal to the leading eigenvector
of Equation (21). Then, after defining a rescaled vector $u = D^{1/2} s$ and plugging it into Equation (25),
we get

$$\left( D^{-1/2} A D^{-1/2} \right) u = \lambda u. \tag{26}$$

The matrix $L = D^{-1/2} A D^{-1/2}$ is called the normalized Laplacian matrix. (The normalized Laplacian is
sometimes defined as $L = I - D^{-1/2} A D^{-1/2}$, but those two differ only by a trivial transformation of
their eigenvalues and eigenvectors.)

2.2.3 Extremal Optimization

Duch and Arenas [16] proposed a modularity optimization algorithm based on Extremal Optimization
(EO) [29]. EO optimizes a global variable by improving extremal local variables. Here, the global variable is
modularity (Q). The contribution of an individual node $i$ to Q of the whole network with a certain community
structure is given by

$$q_i = k_{i,c} - k_i \frac{|E_c|}{2|E|}, \tag{27}$$

where $k_{i,c}$ is the number of edges that connect node $i$ to the nodes in its own community $c$ and $|E_c|$
is the total degree of the nodes in $c$. Notice that $Q = \frac{1}{2|E|} \sum_i q_i$ and that $q_i$ can be
normalized into the interval $[-1, 1]$ by dividing it by $k_i$:

$$\lambda_i = \frac{q_i}{k_i} = \frac{k_{i,c}}{k_i} - \frac{|E_c|}{2|E|}, \tag{28}$$

where $\lambda_i$, called fitness, is the relative contribution of node $i$ to Q. Then, the fitness of each node
is adopted as the local variable.

The algorithm starts by randomly splitting the network into two partitions with an equal number of nodes,
where communities are the connected components in each partition. Then, at each iteration, it moves the
node with the lowest fitness from its own community to another community. The shift changes the community
structure, so the fitness of many other nodes needs to be recomputed. The process repeats until it cannot
increase Q. After that, it generates sub-community networks by deleting the inter-community edges and
proceeds recursively on each sub-community network until Q cannot be improved. Although the procedure
is deterministic when given the initialization, its final result in fact depends on the initialization and it
is likely to get trapped in local maxima. Thus, a probabilistic selection called $\tau$-EO [29], in which
nodes are ranked according to their fitness and a node of rank $r$ is selected with the probability
$P(r) \propto r^{-\tau}$, is used to improve the result. The computational complexity of this algorithm
is $O(|V|^2 \log^2 |V|)$.
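
The node fitness used by this procedure can be computed as in the following Python sketch: for each node, the fraction of its edges that stay inside its own community minus the fraction of the total degree held by that community. The function name, the adjacency-dictionary input, and the deliberately misassigned node in the example are assumptions for illustration; nodes of degree zero are assumed not to occur.

```python
def node_fitness(adj, community_of):
    """Fitness of every node: k_in / k - (degree sum of its community) / (2|E|).

    adj: dict node -> set of neighbours (undirected, no isolated nodes assumed).
    community_of: dict node -> community label.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())     # 2|E|
    degree_sum = {}                                     # total degree per community
    for node, nbrs in adj.items():
        c = community_of[node]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
    fitness = {}
    for node, nbrs in adj.items():
        c = community_of[node]
        k_in = sum(1 for j in nbrs if community_of[j] == c)
        fitness[node] = k_in / len(nbrs) - degree_sum[c] / two_m
    return fitness

# Hypothetical example: two triangles joined by edge (2, 3), with node 2
# placed in the "wrong" community on purpose.
ADJ = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
assignment = {0: "a", 1: "a", 2: "b", 3: "b", 4: "b", 5: "b"}
fitness = node_fitness(ADJ, assignment)
worst = min(fitness, key=fitness.get)    # the node EO would consider moving first
print(worst, fitness[worst])
```

In this example the misassigned node has the lowest fitness, so the deterministic variant of the procedure would select it for the next move.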

2.2.4 Simulated Annealing 

Simulated annealing (SA) [30] is a probabilistic procedure
for the global optimization problem of locating a
good approximation to the global optimum of a given
function in a large search space. This technique was
adopted in [17]-[20] to maximize modularity (Q). The
initial point for all those approaches can be an arbitrary
partitioning of nodes into communities, even including
|V| communities in which each node belongs to its own
community. At each iteration, a node i and a community
c are chosen randomly. This community could be a
currently existing community or an empty community
introduced to increase the number of communities. Then,
node i is moved from its original community to this new
community c, which would change Q by ΔQ. If ΔQ is
greater than zero, this update is accepted; otherwise it
is accepted with probability $e^{\beta \Delta Q}$, where β in [17]-[19]
represents the inverse of temperature T and β in [20]
is the reciprocal of pseudo temperature τ. In addition,
in [20] there is one more condition for moving a node:
when c is not empty, shifting node i to c is
considered only if there are some edges between node i
and the nodes in c. To improve the performance and
to avoid getting trapped in local optima, collective
movements which involve moving multiple nodes at a
time [19], [20], merging two communities [17]-[19], and
splitting a community [17]-[19] are employed. Splits can
be carried out in a number of different schemes. The
best performance is achieved by treating a community
as an isolated subnetwork and partitioning it into two
and then performing a nested SA on these partitions
[17], [18]. Those methods stop when no new update is
accepted within a fixed number of iterations.
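The acceptance rule for a single node move can be sketched as follows. This is a minimal illustration that assumes the standard Metropolis form, in which a move that lowers Q by |ΔQ| is accepted with probability exp(βΔQ); the function name `accept_move` is hypothetical.

```python
import math
import random

def accept_move(delta_q, beta, rng=random):
    """Metropolis-style acceptance for modularity maximization.

    Assumption: moves with delta_q > 0 are always accepted, others with
    probability exp(beta * delta_q), where beta is the inverse temperature.
    """
    if delta_q > 0:
        return True
    return rng.random() < math.exp(beta * delta_q)

# Usage: at high temperature (small beta) most worsening moves pass,
# at low temperature (large beta) almost none do.
rng = random.Random(1)
hot = sum(accept_move(-0.01, 1.0, rng) for _ in range(1000))
cold = sum(accept_move(-0.01, 1000.0, rng) for _ in range(1000))
```

Gradually increasing `beta` over the iterations corresponds to cooling the system.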

2.2.5 Sampling Techniques 

Sales-Pardo et al. [21] proposed a "box-clustering"
method to extract the hierarchical organization of networks.
This approach consists of two steps: (1) estimating
the similarity, called "node affinity", between
nodes and forming the node affinity matrix; (2) deriving
hierarchical community structure from the affinity matrix.
The affinity between two nodes is the probability
that they are classified into the same community in the
local maxima partitions of modularity. The set of local
maxima partitions, called $P_{max}$, includes those partitions
for which neither the moving of a node from its original
community to another, nor the merging of two communities
will increase the value of modularity. The sample
$P_{max}$ is found by performing the simulated annealing
based modularity optimization algorithm of Guimerà
and Amaral [17], [18]. More specifically, the algorithm
first randomly divides the nodes into communities and
then performs the hill-climbing search until a sample
with local maximum of modularity is reached. Then, the
affinity matrix is updated based on the obtained sample.

The sample generation procedure is repeated until the
affinity matrix has converged to its asymptotic value.
Empirically, the total number of samples needed is
proportional to the size of the network. Before proceeding
to the second step, the algorithm assesses whether the
network has a significant community structure or not.
This is done by computing the z-score of the average
modularity of the partitions in $P_{max}$ with respect to
the average modularity of the partitions with the local
modularity maxima of the equivalent ensemble of null
model networks. The equivalent null model is obtained
by randomly rewiring the edges of the original network
while retaining the degree sequence. A large z-score
indicates that the network has a meaningful internal
community structure. If the network indeed has a
significant community structure, the algorithm advances
to the second step to group nodes with large affinity
close to each other. The goal is to bring the form of
the affinity matrix as close as possible to block-diagonal
structure by minimizing the cost function representing
the average distance of matrix elements to the diagonal.
Then, the communities correspond to the "best" set
of boxes obtained by least-squares fitting of the
block-diagonal structure to the affinity matrix. The procedure
described above can be recursively applied to the
subnetworks induced by communities to identify the
low-level structure of each community until no subnetwork
is found to have significant intrinsic structure.
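The first step, estimating node affinities from a sample of partitions, can be sketched as below. This is only an illustration of the co-classification idea: the partitions are assumed to be given as lists of community labels (one label per node), and the function name `affinity_matrix` is hypothetical. How the sample of local maxima is generated is outside the scope of the sketch.

```python
def affinity_matrix(partitions, n):
    """Fraction of sampled partitions in which two nodes share a community.

    partitions: list of label lists, each of length n (assumed input format).
    Returns an n x n matrix of co-classification frequencies in [0, 1].
    """
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    total = len(partitions)
    return [[c / total for c in row] for row in counts]

# Usage with three toy partitions of four nodes.
sample = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
aff = affinity_matrix(sample, 4)
```

In this toy sample nodes 0 and 3 are never grouped together, so their affinity is zero, while every node has affinity one with itself.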

2.2.6 Mathematical Programming 

Agarwal and Kempe [22] formulated the modularity
maximization problem as a linear program and a vector
program, which have the advantage of providing a
posteriori performance guarantees. First, modularity
maximization can be transformed into the integer program

$$
\begin{aligned}
\text{Maximize} \quad & \frac{1}{2|E|} \sum_{ij} B_{ij} (1 - x_{ij}) \\
\text{subject to} \quad & x_{ik} \le x_{ij} + x_{jk} \quad \text{for all } i, j, k \\
& x_{ij} \in \{0, 1\} \quad \text{for all } i, j,
\end{aligned}
\tag{29}
$$

where B is the modularity matrix and the objective
function is linear in the variable $x_{ij}$. When $x_{ij} = 0$, i and
j belong to the same community, and $x_{ij} = 1$ indicates
that they are in different communities. The restriction
$x_{ik} \le x_{ij} + x_{jk}$ requires that if i and j are in the same
community and j and k are in the same community, then
i and k are in the same community as well. Solving the
above integer program is NP-hard, but relaxing the last
constraint that $x_{ij}$ is an integer
from {0, 1} to allow $x_{ij}$ to be a real number in the interval
[0, 1] reduces the integer program to a linear program
which can be solved in polynomial time [31]. However,
the solution does not correspond to a partition when
any of the $x_{ij}$ is fractional. To get the communities from
$x_{ij}$, a rounding step is needed. The value of $x_{ij}$ is treated
as the distance between i and j and these distances
are used repeatedly to form communities of "nearby"
nodes. Moreover, optimizing modularity by dividing a
network into two communities can be considered as a
strict quadratic program

$$
\begin{aligned}
\text{Maximize} \quad & \frac{1}{4|E|} \sum_{ij} B_{ij} (1 + s_i s_j) \\
\text{subject to} \quad & s_i^2 = 1 \quad \text{for all } i,
\end{aligned}
\tag{30}
$$

where the objective function is the same as Equation (5)
defined by Newman [3]. Note that the constraint $s_i^2 = 1$
ensures that $s_i = \pm 1$, which implies that node i belongs
either to the first or the second community. Quadratic
programming is NP-complete, but it could be relaxed to
a vector program by replacing each variable $s_i$ with a
|V|-dimensional vector and replacing the scalar product
with the inner vector product. The solution to the vector
program is one location per node on the surface of a
|V|-dimensional hypersphere. To obtain a bipartition from
these node locations, a rounding step is needed which
chooses any random (|V| - 1)-dimensional hyperplane
passing through the origin and uses this hyperplane
to cut the hypersphere into two halves and as a result
separate the node vectors into two parts. Multiple random
hyperplanes can be chosen and the one that gets
the community structure with the highest modularity
provides a solution. The same vector program is then
recursively applied to subnetworks generated with nodes
and edges in discovered communities to get hierarchical
communities until Q cannot be increased. Following
the linear program and vector program, Agarwal and
Kempe also adopted a post-processing step similar to
the local search strategy proposed by Newman [3] to
further improve the results.
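The random hyperplane rounding step can be sketched in a few lines. The sketch assumes that the vector program has already been solved elsewhere and that its solution is available as one vector per node; the names `hyperplane_round` and `best_rounding`, and the caller-supplied `score` function, are hypothetical.

```python
import random

def hyperplane_round(vectors, rng=random):
    """Split node vectors by a random hyperplane through the origin.

    The hyperplane is represented by a Gaussian random normal vector; a
    node gets label +1 or -1 depending on which side its vector lies.
    """
    dim = len(vectors[0])
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [1 if sum(a * b for a, b in zip(v, normal)) >= 0 else -1
            for v in vectors]

def best_rounding(vectors, score, trials=20, rng=random):
    """Try several hyperplanes and keep the bipartition with the best score.

    score is assumed to map a list of +1/-1 labels to a quality value,
    for instance the modularity of the corresponding bipartition.
    """
    best = None
    for _ in range(trials):
        labels = hyperplane_round(vectors, rng)
        value = score(labels)
        if best is None or value > best[0]:
            best = (value, labels)
    return best

# Usage: two groups of antipodal vectors are always separated.
vecs = [[1.0, 0.0, 0.0]] * 4 + [[-1.0, 0.0, 0.0]] * 4
labels = hyperplane_round(vecs, random.Random(3))
```

Nodes whose vectors point in similar directions are rarely separated by a random hyperplane, which is why the rounding tends to preserve the structure found by the relaxation.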

2.3 Resolution limit 

Since its inception, modularity has been used extensively
as the measure of the quality of partitions produced
by community detection algorithms. In fact, if we
adopt modularity as a quality measure of communities,
the task of discovering communities is essentially turned
into the task of finding the network partitioning with an
optimal value of modularity.

However, as properties of the modularity were studied,
it was discovered that in some cases it fails to detect
small communities. There is a certain threshold [25],
such that a community of a size below it will not be
detected even if it is a complete subgraph connected to
the rest of the graph with a single edge. This property
of modularity has become known as the resolution limit.

Although the resolution limit prevents detection of
small communities, the actual value of the threshold
depends on the total number of edges in the network
and on the degree of interconnectedness between
communities. In fact, the resolution limit can reach values
comparable to the size of the entire network, causing
formation of a few giant communities (or even a single
community) and failing to detect smaller communities
within them. This makes interpreting the results of
community detection very difficult because it is impossible
to tell beforehand whether a community is well-formed
or if it can be further split into subcommunities.

Considering modularity as a function of the total number
of edges, |E|, and the number of communities, m,
makes it possible to find the values of m and |E| which
maximize this function. It turns out that setting
$m = \sqrt{|E|}$ yields the absolute maximal value of modularity.
Consequently, modularity has a resolution limit of order
$\sqrt{|E|}$ which bounds the number and size of communities
[25]. In fact, if for a certain community the number of
edges inside it is smaller than $\sqrt{|E|/2}$, such a community
cannot be resolved through modularity optimization.
It is also possible for modularity optimization to fail
to detect communities of larger size if they have more
edges in common with the rest of the network. Therefore,
by finding the optimal value of the modularity we are
generally not obtaining the best possible structure of
communities.
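The effect can be reproduced on the classical ring of cliques. The sketch below is only an illustration under stated assumptions: the helper names `ring_of_cliques` and `modularity` are hypothetical, the graph is unweighted and undirected, and modularity is computed in its standard per-community form (fraction of internal edges minus the squared fraction of the degree sum).

```python
def ring_of_cliques(n_cliques, clique_size):
    """Edge list of n_cliques complete graphs joined in a ring by single edges."""
    edges = []
    for c in range(n_cliques):
        base = c * clique_size
        for a in range(clique_size):
            for b in range(a + 1, clique_size):
                edges.append((base + a, base + b))
        nxt = ((c + 1) % n_cliques) * clique_size
        edges.append((base, nxt + 1))  # one edge to the next clique
    return edges

def modularity(edges, labels):
    """Standard modularity Q of a partition given as one label per node."""
    m = len(edges)
    inside, degree = {}, {}
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

# 30 cliques of 5 nodes: compare one community per clique with
# one community per pair of adjacent cliques.
ring = ring_of_cliques(30, 5)
per_clique = [node // 5 for node in range(150)]
per_pair = [node // 10 for node in range(150)]
q_clique = modularity(ring, per_clique)
q_pair = modularity(ring, per_pair)
```

For this network the partition that merges adjacent cliques in pairs scores higher than the intuitively correct partition with one community per clique, so maximizing Q prefers the merged structure.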

The above arguments can also be applied to weighted
networks. In this case, |E| is the sum of the weights of all
the edges in the network, $|E_{c_i}^{in}|$ is the sum of the weights
of the edges between nodes within community $c_i$, and
$|E_{c_i}^{out}|$ is the sum of the weights of the edges from the
nodes in community $c_i$ to the nodes outside $c_i$.

By introducing an additional parameter, ε, which represents
the weight of inter-community edges, Berry et al.
showed in [32] that the number of communities in the
optimal solution is



Correspondingly, any community for which its size

(32) 

may not be resolved. 

Introduction of ε brings some interesting opportunities.
If we can make ε arbitrarily small, then we can
expect maximum weighted modularity to produce any
desired number of communities. In other words, given
a proper weighting, a much better modularity resolution
can be achieved than without weighting. However,
in practice, finding a way to set edge weights to
achieve small values of ε can be challenging. An algorithm
for lowering ε proposed by Berry et al. requires
$O(m|V| \log |V|)$ time.


2.4 Resolving the resolution limit problem

There have been extensive studies on how to mitigate
the consequences of the modularity resolution limit.
The main approaches followed are described below.

Localized modularity measure (LQ) [33] is based on
the observation that the resolution limit problem is
caused by modularity being a global measure, since it
assumes that edges between any pairs of nodes are equally
likely, including connectivity between the communities.
However, in many networks, the majority of communities
have edges to only a few other communities, i.e.
exhibit a local community connectivity.

Thus, a local version of the modularity measure for a
directed network is defined as:

$$ LQ = \sum_{c_i \in C} \left[ \frac{|E_{c_i}^{in}|}{|E_{c_i}^{neighb}|} - \frac{|E_{c_i}^{in}| \, |E_{c_i}^{out}|}{|E_{c_i}^{neighb}|^2} \right], \tag{33} $$

where $|E_{c_i}^{neighb}|$ is the total number of edges in the
neighboring communities of $c_i$, i.e. in the communities
to which all neighbors of $c_i$ belong.

Unlike traditional modularity (Q), the local version of
modularity (LQ) is not bounded above by 1. The more
locally connected communities a network has, the bigger
its LQ can grow. In a network where all communities
are connected to each other, LQ yields the same value
as Q. LQ considers individual communities and their
neighbors, and therefore provides a measure of community
quality that is not dependent on other parts of the
network. The local connectivity approach can be applied
not only to the nearest neighboring communities, but
also to the second or higher neighbors.

Arenas et al. proposed a multiple resolution method
[34] which is based on the idea that it might be possible
to look at the detected community structure at different
scales. From this perspective, the modularity resolution
limit is not a problem but a feature. It allows choosing a
desired resolution level to achieve the required granularity
of the output community structure using the original
definition of modularity.

The multiple resolution method is based on the definition
of modularity given by Equation (1). The modularity
resolution limit depends on the total weight 2|E|.
By varying the total weight, it is possible to control
the resolution limit, effectively performing community
detection at different granularity levels. Changing the
sum of weights of edges adjacent to every node by
some value r results in rescaling topology by a factor
of r. Since the resolution limit is proportional to $\sqrt{r}$, the
growth of the resolution limit is slower than that of r.
Consequently, it would be possible to achieve a scale at
which all required communities would be visible to the
modularity optimization problem.

Caution should be exercised when altering the weights
of edges in the network to avoid changing its topological
characteristics. To ensure this, a rescaled adjacency
matrix can be defined as:

$$ A_r = A + rI, \tag{34} $$

where A is the adjacency matrix and I is the identity
matrix. Since the original edge weights are not altered, $A_r$
preserves all common features of the network: distribution
of sum of weights, weighted clustering coefficient,
eigenvectors, etc. Essentially, introducing r results in a
self-loop of weight r being added to every node in the
network.

Optimizing the modularity for the rescaled topology
$A_r$ is performed by using the modularity at scale r as
the new quality function:

$$ Q_r = \sum_{c_i \in C} \left[ \frac{2|E_{c_i}^{in}| + r|c_i|}{2|E| + r|V|} - \left( \frac{|E_{c_i}| + r|c_i|}{2|E| + r|V|} \right)^2 \right], \tag{35} $$

where $|c_i|$ is the number of nodes in community $c_i$ and
$|E_{c_i}| = 2|E_{c_i}^{in}| + |E_{c_i}^{out}|$. It yields larger communities for
smaller values of r and smaller communities for larger
values of r. By performing modularity optimization
for different values of r, it is possible to analyze the
community structure at different scales.
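The behavior of the modularity at scale r can be checked numerically. The sketch below assumes the form in which a self-loop of weight r is added to every node, so that each community gains r times its size in both the internal and the total degree terms; the function name `modularity_at_scale` and the toy network (two 4-node cliques joined by one edge) are choices made for illustration.

```python
def modularity_at_scale(edges, labels, r=0.0):
    """Modularity of an unweighted graph after adding a self-loop of weight r.

    Assumed form, per community c with n_c nodes:
        (2*E_in + r*n_c) / (2|E| + r|V|) - ((2*E_in + E_out + r*n_c) / (2|E| + r|V|))**2
    """
    n = len(labels)
    size, inside, degree = {}, {}, {}
    for c in labels:
        size[c] = size.get(c, 0) + 1
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    total = 2 * len(edges) + r * n
    q = 0.0
    for c, n_c in size.items():
        q += (2 * inside.get(c, 0) + r * n_c) / total
        q -= ((degree.get(c, 0) + r * n_c) / total) ** 2
    return q

# Two 4-node cliques (nodes 0-3 and 4-7) joined by the edge (3, 4).
clique = [(a, b) for a in range(4) for b in range(a + 1, 4)]
edges = clique + [(a + 4, b + 4) for a, b in clique] + [(3, 4)]
two_cliques = [0, 0, 0, 0, 1, 1, 1, 1]
singletons = list(range(8))
```

With r = 0 the function reduces to ordinary modularity and prefers the two cliques; with a large positive r (for example r = 100) the partition into single nodes scores higher, which matches the statement that larger r favors smaller communities.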

Parameter r can also be thought of as representing the
resistance of a node to becoming part of a community.
If r is positive, we can obtain a network community
structure that is more granular than what was possible
to achieve with the original definition of modularity (Q),
which corresponds to r being zero. Making r negative
zooms out of the network and provides a view of super
communities.

Further studies of the multiple resolution approach
revealed that it suffers from two major issues outlined
in [35]. First, when the value of the resolution parameter
r is low, it tends to group together small communities.
Second, when the resolution is high, it splits large
communities. These trends are opposite for networks
with a large variation of community sizes. Hence, it is
impossible to select a value of the resolution parameter
such that neither smaller nor larger communities are
adversely affected by the resolution limit. A network can
be tested for susceptibility to the resolution problem by
examining its clustering coefficient, i.e. the degree to which
nodes tend to form communities. If the clustering coefficient
has sharp changes, it indicates that communities of
substantially different scales exist in this network. The
result is that when the value of r is sufficiently large,
bigger communities get broken up before smaller communities
are found. This applies also to other multiple
resolution methods and seems to be a general problem
of the methods that are trying to optimize some global
measure.

The hierarchical multiresolution method proposed by
Granell et al. in [36] overcomes the limitations of the
multiple resolution method on networks with very different
scales of communities. It achieves that by introducing
a new hierarchical multiresolution scheme that
works even in cases of community detection near the
modularity resolution limit. The main idea underlying
this method is based on performing multiple resolution
community detection on essential parts of the network,
thus analyzing each part independently.

The method operates iteratively by first placing all
nodes in a single community. Then, it finds the minimum
value of the resistance parameter r which produces a
community structure with the optimal value of modularity.
Finally, it runs the same algorithm on each community
that was found. The method terminates when no
more splits of communities are necessary, which usually
takes just a few steps.

Another approach to leveraging the results of modularity
optimization has been introduced by Chakraborty
et al. in [27]. It is based on the observation that a
simple change to the order of nodes in a network can
significantly affect the community structure. However,
a closer examination of the communities produced in
different runs of a certain community detection algorithm
reveals that for many networks the same invariant
groups of nodes are consistently assigned to the same
communities. Such groups of nodes are called constant
communities. The percentage of constant communities
varies depending on the network. Constant communities
are detected by trying different node permutations
while preserving the degree sequence of the nodes. For
networks that have strong community structure, the
constant communities detected can be adopted as a
pre-processing step before performing modularity optimization.
This can lead to higher modularity values and
lower variability in results, thus improving the overall
quality of community detection.

In the study [37] by Li, Zhang et al., a new quantitative
measure for community detection is introduced.
It offers several improvements over the modularity (Q),
including elimination of the resolution limit and the ability
to detect the number of communities. The new measure,
called modularity density (D), is based on the average
degree of the community structure. It is given by:

$$ D = \sum_{c_i \in C} \frac{2|E_{c_i}^{in}| - |E_{c_i}^{out}|}{|c_i|}. \tag{36} $$

The quality of the communities found is then described
by the value of the modularity density (D). The larger
the value of D, the stronger the community structure is.
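As a small numerical illustration, the sketch below evaluates D on two 4-node cliques joined by a single edge. It assumes that D sums, over communities, twice the number of internal edges minus the number of outgoing edges, divided by the community size; the function name `li_modularity_density` is hypothetical.

```python
def li_modularity_density(edges, communities):
    """Modularity density D for a partition given as a list of node sets.

    Assumed per-community term: (2 * E_in - E_out) / |c|.
    """
    label = {v: idx for idx, com in enumerate(communities) for v in com}
    e_in = [0] * len(communities)
    e_out = [0] * len(communities)
    for u, v in edges:
        a, b = label[u], label[v]
        if a == b:
            e_in[a] += 1
        else:
            e_out[a] += 1
            e_out[b] += 1
    return sum((2 * e_in[a] - e_out[a]) / len(com)
               for a, com in enumerate(communities))

# Two 4-node cliques (nodes 0-3 and 4-7) joined by the edge (3, 4).
clique = [(a, b) for a in range(4) for b in range(a + 1, 4)]
edges = clique + [(a + 4, b + 4) for a, b in clique] + [(3, 4)]
d_two = li_modularity_density(edges, [{0, 1, 2, 3}, {4, 5, 6, 7}])
d_one = li_modularity_density(edges, [set(range(8))])
d_cut = li_modularity_density(edges, [{0, 1}, {2, 3}, {4, 5, 6, 7}])
```

Here the partition into the two cliques gives the largest value, while merging everything or cutting a clique in half lowers D.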

The modularity density (D) does not divide a clique
into two parts, and it can resolve most modular networks
correctly. It can also detect communities of different
sizes. This second property can be used to quantitatively
determine the number of communities, since the maximum
D value is achieved when the network is
correctly partitioned. Although, as mentioned in [37],
finding an optimal value of modularity density (D) is
NP-hard, it is equivalent to an objective function of the
kernel k-means clustering problem for which efficient
computational algorithms are known.

Traag et al. in [38] introduce a rigorous definition of
the resolution-limit-free method for which considering any
induced subgraph of the original graph does not cause
the detected community structure to change. In other
words, if there is an optimal partitioning of a network
(with respect to some objective function), and for each
subgraph induced by the partitioning it is also optimal,
then such an objective function is called resolution-limit-free.
An objective function is called additive for a certain
partitioning if it is equal to the sum of the values of this
objective function for each of the subgraphs induced by
the partitioning.

Based on these two definitions it is proved that if an
objective function is additive and there are two optimal
partitions, then any combination of these partitions is
also optimal. In the case of a complete graph, if an objective
function is resolution-limit-free, then an optimal
partitioning either contains all the nodes (i.e. there is only
one community which includes all nodes) or consists of
communities of size 1 (i.e. each node forms a community
of its own). A more general statement for arbitrary
objective functions is also true: if an objective function
has local weights (i.e. weights that do not change when
considering subgraphs) then it is resolution-limit-free.
Although the converse is not true, there is only a
relatively small number of special cases when methods with
non-local weights are resolution-limit-free.

The authors then analyze resolution-limit-free methods within the
framework of the first principle Potts model [39]:

$$ \mathcal{H} = -\sum_{ij} \left( a_{ij} A_{ij} - b_{ij} (1 - A_{ij}) \right) \delta(c_i, c_j), \tag{37} $$

where $a_{ij}, b_{ij} \ge 0$ are some weights. The intuition behind
this formula is that a community should have more
edges inside it than edges which connect it to other
communities. Thus, it is necessary to reward existing
links inside a community and penalize links that are
missing from a community. The smaller the value of $\mathcal{H}$ is,
the more desirable the community structure is. However,
the minimal value might not be unique.

Given the definition of $\mathcal{H}$, it is possible to describe
various existing community detection methods with an
appropriate choice of parameters, as well as propose
alternative methods. The following models are shown
to fit into $\mathcal{H}$: Reichardt and Bornholdt (RB), Arenas,
Fernández, and Gómez (AFG), Ronhovde and Nussinov
(RN), as well as the label propagation method. The RB
approach with a configuration null model also covers
the original definition of modularity. The authors also
propose a new method called the constant Potts model
(CPM) by choosing $a_{ij} = w_{ij} - b_{ij}$ and $b_{ij} = \gamma$, where $w_{ij}$
is the weight of the edge between nodes i and j, and
$\gamma$ is a constant. CPM is similar to the RB and RN models
but is simpler and more intuitive. CPM and RN have
local weights and are consequently resolution-limit-free,
while RB, AFG, and modularity are not.
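Substituting the CPM weights into Equation (37) gives a compact form of the objective. The short derivation below is a sketch that uses only the definitions stated above ($a_{ij} = w_{ij} - b_{ij}$ and $b_{ij} = \gamma$); the label $\mathcal{H}_{CPM}$ is introduced here for illustration.

```latex
\begin{aligned}
\mathcal{H}_{CPM}
  &= -\sum_{ij} \bigl( (w_{ij} - \gamma) A_{ij} - \gamma (1 - A_{ij}) \bigr)\, \delta(c_i, c_j) \\
  &= -\sum_{ij} \bigl( w_{ij} A_{ij} - \gamma A_{ij} - \gamma + \gamma A_{ij} \bigr)\, \delta(c_i, c_j) \\
  &= -\sum_{ij} \bigl( A_{ij} w_{ij} - \gamma \bigr)\, \delta(c_i, c_j).
\end{aligned}
```

In this form every pair of nodes placed in the same community contributes $-(A_{ij} w_{ij} - \gamma)$, so keeping a pair together lowers $\mathcal{H}$ only when the weight of the edge between them exceeds $\gamma$. The constant $\gamma$ therefore plays the role of a threshold that does not depend on the rest of the network, which is consistent with the weights being local.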

However, all of the above approaches are aimed at
solving only the resolution limit problem. Work done by
Chen et al. in [23], [24] adopts a different definition of
modularity density which simultaneously addresses two
problems of modularity. It is done by mixing two additional
components, Split Penalty (SP) and the community
density, into the well-known definition of modularity.
Community density includes internal community density
and pair-wise community density. Split Penalty (SP)
is the fraction of edges that connect nodes of different
communities:



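$$ SP = \sum_{c_i \in C} \left[ \sum_{c_j \in C,\, c_j \ne c_i} \frac{|E_{c_i, c_j}|}{2|E|} \right], \tag{38} $$

where $|E_{c_i, c_j}|$ is the number of edges from community $c_i$ to community $c_j$.

Fig. 1. A simple network with two clique communities.
Each clique has four nodes and the two clique communities
are connected to each other with one single edge.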


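The value of Split Penalty is subtracted from modularity,
while the value of the community density is added to
modularity and Split Penalty. Introducing Split Penalty
resolves the issue of favoring small communities.
Community density eliminates the problem of favoring large
communities (also known as the resolution limit problem).
The Modularity Density (Qds) is then given by:

$$
\begin{aligned}
Q_{ds} &= \sum_{c_i \in C} \left[ \frac{|E_{c_i}^{in}|}{|E|} d_{c_i} - \left( \frac{2|E_{c_i}^{in}| + |E_{c_i}^{out}|}{2|E|} d_{c_i} \right)^2 - \sum_{c_j \in C,\, c_j \ne c_i} \frac{|E_{c_i, c_j}|}{2|E|} d_{c_i, c_j} \right], \\
d_{c_i} &= \frac{2|E_{c_i}^{in}|}{|c_i| (|c_i| - 1)}, \qquad
d_{c_i, c_j} = \frac{|E_{c_i, c_j}|}{|c_i| |c_j|},
\end{aligned}
\tag{39}
$$

where $d_{c_i}$ is the internal density of community $c_i$, and $d_{c_i, c_j}$
is the pair-wise density between community $c_i$ and
community $c_j$.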

Modularity Density (Qds) avoids falling into the trap
of merging two or more consecutive cliques in the ring
of cliques network or dividing a clique into two or
more parts. It can also discover communities of different
sizes. Thus, using Qds solves both the resolution limit
problem of modularity and the problem of splitting
larger communities into smaller ones. Hence, Qds is a
very effective alternative to Q.


3 Fine-tuned Algorithm 

In our previous papers [23], [24], we have given the
definition of Modularity Density (Qds). With formal proofs
and experiments on two real dynamic datasets (Senate
dataset [40] and Reality Mining Bluetooth Scan data [41])
we demonstrated that Qds solves the two opposite yet
coexisting problems of modularity: the problem of favoring
small communities and the problem of favoring large
communities (also called the resolution limit problem).
Moreover, for a given community in Qds defined by
Equation (39), its internal and pair-wise densities and its
split penalty are local components, which is related to the
resolution-limit-free definition in [38]. Therefore, it is
reasonable to expect that maximizing Qds would discover
more meaningful community structure than maximizing
Q. In this section, we first illustrate why the greedy
agglomerative algorithm for increasing Qds cannot be
adopted for optimizing Qds. Then, we propose a fine-tuned
community detection algorithm that repeatedly attempts
to improve the community quality measurements
by splitting and merging the given network community
structure to maximize Qds.

3.1 Greedy Algorithm Fails to Optimize Qds 

In this subsection, we show why the greedy agglomerative
algorithm increasing Qds fails to optimize it. At the
first step of the greedy algorithm for increasing Qds, each
node is treated as a single community. Then, Qds of each
node or community is Qds = -SP. Therefore, in order to
increase Qds the most, the greedy algorithm would first
merge the connected pair of nodes with the sum of their
degrees being the largest among all connected pairs.
However, it is very likely that those two nodes belong to
two different communities, which would finally result in
merging those two communities instead of keeping them
separate. This will result in a much lower value of Qds
for such a merged community compared to Qds for its
components, demonstrating the reason for the failure of
the greedy Qds algorithm in optimizing Qds.

For example, in the network example in Figure 1, the
initial values of Qds for nodes 1, 2, 4, 6, 7, and 8 with
degree 3 are Qds = -SP = -3/26, while the initial values
of Qds for nodes 3 and 5 with degree 4 are Qds = -SP =
-4/26. Then, the greedy Qds algorithm would first merge node
3 and node 5, which would finally lead to a single
community of all eight nodes. However, the true
community structure contains two clique communities.
Accordingly, the Qds of the community structure with
two clique communities, 0.4183, is larger than that of the
community structure with one single large community,
0.2487. So, maximizing Qds properly should have the
ability to discover the true community structure.
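The two values quoted for the network of Figure 1 can be reproduced with a short script. The sketch below is not the authors' implementation: it assumes that Qds sums, per community, the density-weighted internal edge fraction, minus the squared density-weighted degree fraction, minus a density-weighted split penalty, with the exact form given in the code comments. The function name `modularity_density` and the node numbering of the bridge (nodes 3 and 5) follow the degrees mentioned in the example.

```python
def modularity_density(edges, communities):
    """Qds of a partition (list of node sets) of an unweighted graph.

    Assumed per-community term, with d_in = 2*E_in / (|c|*(|c|-1)) and
    d_pair = E_between / (|c|*|c'|):
        E_in/|E| * d_in - ((2*E_in + E_out)/(2|E|) * d_in)**2
        - sum over other communities of E_between/(2|E|) * d_pair
    A single-node community is given internal density 0.
    """
    m = len(edges)
    label = {v: idx for idx, com in enumerate(communities) for v in com}
    k = len(communities)
    between = [[0] * k for _ in range(k)]  # between[a][a] counts internal edges
    for u, v in edges:
        a, b = label[u], label[v]
        between[a][b] += 1
        if a != b:
            between[b][a] += 1
    total = 0.0
    for a, com in enumerate(communities):
        size = len(com)
        e_in = between[a][a]
        e_out = sum(between[a][b] for b in range(k) if b != a)
        d_in = 2.0 * e_in / (size * (size - 1)) if size > 1 else 0.0
        total += e_in / m * d_in - ((2 * e_in + e_out) / (2 * m) * d_in) ** 2
        for b in range(k):
            if b != a and between[a][b]:
                d_pair = between[a][b] / (size * len(communities[b]))
                total -= between[a][b] / (2 * m) * d_pair
    return total

# Figure 1: cliques {1,2,3,4} and {5,6,7,8} joined by the edge (3, 5).
left, right = [1, 2, 3, 4], [5, 6, 7, 8]
edges = [(a, b) for grp in (left, right) for i, a in enumerate(grp) for b in grp[i + 1:]]
edges.append((3, 5))
q_two = modularity_density(edges, [set(left), set(right)])
q_one = modularity_density(edges, [set(left + right)])
q_single_nodes = modularity_density(edges, [{v} for v in left + right])
```

Under these assumptions `q_two` rounds to 0.4183 and `q_one` to 0.2487, and the all-singletons partition sums the per-node values of minus degree over 26.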
3.2 Fine-tuned Algorithm

In this part, we describe a fine-tuned community
detection algorithm that iteratively improves a community
quality metric M by splitting and merging the given
network community structure. We denote the
corresponding algorithm based on modularity (Q) as
Fine-tuned Q and the one based on Modularity Density (Qds)
as Fine-tuned Qds. The algorithm consists of two alternating
stages: the split stage and the merging stage.


Algorithm 1 Split_Communities(G, C)

1: Initialize comWeights[|C|][|C|], comEdges[|C|][|C|],
   and comDensities[|C|][|C|], which respectively contain
   #weights, #edges, and the density inside the
   communities and between two communities, by using
   the network G and the community list C;
2: //Get the metric value for each community.
3: Mes[|C|] = GetMetric(C, comWeights, comDensities);
4: for i = 0 to |C| - 1 do
5:   c = C.get(i);
6:   subnet = GenerateSubNetwork(c);
7:   fiedlerVector[|c|] = LanczosMethod(subnet);
8:   nodeIds[|c|] = sort(fiedlerVector, 'descend');
9:   //Form |c| + 1 divisions and record the best one.
10:  splitTwoCom.addAll(nodeIds);
11:  for j = 0 to |c| - 1 do
12:    splitOneCom.add(nodeIds[j]);
13:    splitTwoCom.remove(nodeIds[j]);
14:    Calculate M(split) for the split at j;
15:    ΔM = M(split) - Mes[i];
16:    if ΔM(best) < ΔM (or ΔM(best) > ΔM) then
17:      ΔM(best) = ΔM;
18:      bestIdx = j;
19:    end if
20:  end for
21:  if ΔM(best) > 0 (or ΔM(best) < 0) then
22:    Clear splitOneCom and splitTwoCom;
23:    splitOneCom.addAll(nodeIds[0:bestIdx]);
24:    splitTwoCom.addAll(nodeIds[bestIdx+1:|c| - 1]);
25:    newC.add(splitOneCom);
26:    newC.add(splitTwoCom);
27:  else
28:    newC.add(c);
29:  end if
30: end for
31: return newC


In the split stage, the algorithm will split a community
c into two subcommunities $c_1$ and $c_2$ based on
the ratio-cut method if the split improves the value of
the quality metric. The ratio-cut method [42] finds the
bisection that minimizes the ratio $|E_{c_1, c_2}| / (|c_1| |c_2|)$,
where $|E_{c_1, c_2}|$ is the cut size (namely, the number of edges between
communities $c_1$ and $c_2$), while $|c_1|$ and $|c_2|$ are the sizes of
the two communities. This ratio penalizes situations in
which either of the two communities is small and thus
favors balanced divisions over unbalanced ones. However,
graph partitioning based on the ratio-cut method
is an NP-complete problem. Thus, we approximate it by
using the Laplacian spectral bisection method for graph
partitioning introduced by Fiedler [43], [44].

First, we calculate the Fiedler vector, which is the
eigenvector of the network Laplacian matrix L = D - A
corresponding to the second smallest eigenvalue. Then,
we put the nodes corresponding to the positive values
of the Fiedler vector into one group and the nodes
corresponding to the negative values into the other group.
The subnetwork of each community is generated with
the nodes and edges in that community. Although the
ratio-cut approximated with the spectral bisection method
does allow some deviation for the sizes $|c_1|$ and $|c_2|$ to
vary around the middle value, the right partitioning may
not actually divide the community into two balanced or
nearly balanced ones. Thus, it is to some extent
inappropriate and unrealistic for community detection problems.
We overcome this problem by using the following strategy.
First, we sort the elements of the Fiedler vector in
descending order, then cut them into two communities
in each of the |c| + 1 possible ways and calculate the
corresponding change of the metric values ΔM of all
the |c| + 1 divisions. Then, the one with the best value
(largest or smallest depending on the measurement)
of the quality metric ΔM(best) among all the |c| + 1
divisions is recorded. We adopt this best division of the
community c only when ΔM(best) > 0 (or ΔM(best) < 0,
depending on the metric). For instance, we split the
community only when ΔQds(best) is larger than zero.
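The sweep over the sorted Fiedler vector can be sketched as follows. This is a simplified illustration and not the implementation described in the paper: the Fiedler vector is approximated by plain power iteration instead of the Lanczos method, the quality metric is ordinary modularity Q, only the non-trivial cuts are examined, and every cut is evaluated from scratch rather than incrementally. All function names are hypothetical.

```python
def fiedler_vector(adj, iters=500):
    """Approximate Fiedler vector of L = D - A by power iteration.

    adj[i] lists the neighbours of node i. Assumption: the graph is connected.
    Iterating with (shift*I - L) after removing the constant component
    converges to the eigenvector of the second smallest eigenvalue of L.
    """
    n = len(adj)
    shift = 2.0 * max(len(nb) for nb in adj)  # upper bound on eigenvalues of L
    x = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(x) / n
        x = [v - mean for v in x]
        y = [(shift - len(adj[i])) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x

def modularity(adj, parts):
    """Modularity Q of a partition given as a list of node sets."""
    two_m = sum(len(nb) for nb in adj)
    q = 0.0
    for part in parts:
        deg = sum(len(adj[i]) for i in part)
        inside2 = sum(1 for i in part for j in adj[i] if j in part)
        q += inside2 / two_m - (deg / two_m) ** 2
    return q

def best_sweep_split(adj):
    """Sort nodes by Fiedler value and keep the cut with the largest gain in Q."""
    n = len(adj)
    fv = fiedler_vector(adj)
    order = sorted(range(n), key=lambda i: fv[i], reverse=True)
    base = modularity(adj, [set(range(n))])
    best_gain, best_parts = 0.0, None
    for cut in range(1, n):
        parts = [set(order[:cut]), set(order[cut:])]
        gain = modularity(adj, parts) - base
        if gain > best_gain:
            best_gain, best_parts = gain, parts
    return best_gain, best_parts

# Two 4-node cliques (nodes 0-3 and 4-7) joined by the edge (3, 4).
adj = [[j for j in range(4) if j != i] for i in range(4)]
adj += [[j for j in range(4, 8) if j != i] for i in range(4, 8)]
adj[3].append(4)
adj[4].append(3)
gain, parts = best_sweep_split(adj)
```

On this toy network the best cut separates the two cliques. A community is left unsplit when no cut has a positive gain, in which case `best_parts` stays `None`.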

The outline of the split stage is shown in Algorithm 1.
The input is a network and a community list, and the
output is a list of communities after splitting. The
initialization part has $O(|E|)$ complexity. Computing the Fiedler
vector using the Lanczos method [28] needs
$O(|E|Kh + |V|K^2h + K^3h)$ steps, where K is the number of
eigenvectors needed and h is the number of iterations required
for the Lanczos method to converge. Here, K is 2 and
h is typically very small, although the exact number is
not generally known. So, the complexity for calculating the
Fiedler vector is $O(|E| + |V|)$. Sorting the Fiedler vector has
the cost $O(|V| \log |V|)$. The search for the best division
among all the |c| + 1 possible ones (per community c)
for all the communities is achieved in $O(|E|)$ time. For
the |c| + 1 possible divisions of a community c, each
one differs from the previous one by the movement of
just a single node from one group to the other. Thus,
the update of the total weights, the total number of
edges, and the densities inside those two split communities
and between those two communities and other
communities can be calculated in time proportional to
the degree of that node. Thus, all nodes can be moved
in time proportional to the sum of their degrees, which is
equal to 2|E|. Moreover, for Fine-tuned Qds, computing
Qds(split) costs $O(|C||V|)$ because all the communities
are traversed to update the Split Penalty for each of the
|c| + 1 divisions of each community c. All the other parts
have complexity less than or at most $O(|V|)$. Thus, the
computational complexity for the split stage of Fine-tuned
Q is $O(|E| + |V| \log |V|)$ while for Fine-tuned Qds it is
$O(|E| + |V| \log |V| + |C||V|)$.

Algorithm 2 Merge_Communities(G, C)

 1: Initialize comWeights[|C|][|C|], comEdges[|C|][|C|],
    and comDensities[|C|][|C|];
 2: //Get the metric value for each community.
 3: Mes[|C|] = GetMetric(C, comWeights, comDensities);
 4: for i = 0 to |C| - 1 do
 5:   for j = i + 1 to |C| - 1 do
 6:     //Skip disconnected communities.
 7:     if comWeights[i][j] == 0 && comWeights[j][i] == 0 then
 8:       continue;
 9:     end if
10:     Calculate M(merge) for merging c_i and c_j;
11:     ΔM = M(merge) - Mes[i] - Mes[j];
12:     //Record the merging information in a red-black tree
        //ordered by |ΔM| descending.
13:     if ΔM > 0 (or ΔM < 0) then
14:       mergedInfos.put([|ΔM|, i, j]);
15:     end if
16:   end for
17: end for
18: //Merge each community with the one that improves
    //the value of the quality metric the most.
19: while mergedInfos.hasNext() do
20:   [ΔM, comId1, comId2] = mergedInfos.next();
21:   if !mergedComs.containsKey(comId1) &&
         !mergedComs.containsKey(comId2) then
22:     mergedComs.put(comId1, comId2);
23:     mergedComs.put(comId2, comId1);
24:   end if
25: end while
26: for i = 0 to |C| - 1 do
27:   c_i = C.get(i);
28:   if mergedComs.containsKey(i) then
29:     comId2 = mergedComs.get(i);
30:     if i > comId2 then
31:       continue;  //c_i was already absorbed by community comId2
32:     end if
33:     c_i.addAll(C.get(comId2));
34:   end if
35:   newC.add(c_i);
36: end for
37: return newC;


In the merging stage, the algorithm will merge a
community with one of its connected communities if the
merging improves the value of the quality metric. If there
are many mergers possible for a community, the one,
unmerged so far, which improves the quality metric
the most is chosen. Hence, each community will be
merged at most once in each stage. The outline of
the merging stage is shown in Algorithm 2. The input is
a network and a community list, and the output is a
list of communities after merging. The initialization part
has the complexity O(|E|). For Fine-tuned Q, the two
"for loops" for merging any two communities have the
complexity O(|C|^2 log |C|) because calculating Q(merge)
is O(1) and inserting an element into the red-black tree is
O(log |C|^2) = O(2 log |C|) = O(log |C|), since the maximum
number of elements in the tree is |C|(|C| - 1)/2 = O(|C|^2).
For Fine-tuned Qds, the two "for loops" for merging any
two communities have the complexity O(|C|^3) because
calculating Qds(merge) needs O(|C|) steps to traverse all
the communities to update the Split Penalty, and inserting
an element into the red-black tree is O(log |C|) as well.
The other parts all have complexity at most O(|C|^2).
Thus, the computational complexity for the merging
stage of Fine-tuned Q is O(|E| + |C|^2 log |C|) and for the
merging stage of Fine-tuned Qds it is O(|E| + |C|^3).
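
One merging stage can be sketched as follows for the modularity case. This is a simplified illustration, not the authors' code: the function name `merge_stage_q` and the edge-list input are assumptions, the graph is assumed unweighted, and a sorted list stands in for the red-black tree. For modularity, merging communities i and j changes Q by e_ij/|E| - d_i d_j/(2|E|^2), where e_ij is the number of edges between them and d_i, d_j are their total degrees.

```python
def merge_stage_q(edges, communities):
    """One merging stage for modularity Q; communities is a list of sets."""
    m = len(edges)
    label = {v: i for i, com in enumerate(communities) for v in com}
    degree = [0] * len(communities)   # total degree of each community
    between = {}                      # (i, j) with i < j -> edges between them
    for u, v in edges:
        i, j = label[u], label[v]
        degree[i] += 1
        degree[j] += 1
        if i != j:
            key = (min(i, j), max(i, j))
            between[key] = between.get(key, 0) + 1

    # Only connected pairs are candidates; keep those with positive gain.
    candidates = []
    for (i, j), e_ij in between.items():
        delta = e_ij / m - degree[i] * degree[j] / (2 * m * m)
        if delta > 0:
            candidates.append((delta, i, j))
    candidates.sort(key=lambda item: -item[0])   # best improvement first

    # Each community takes part in at most one merger per stage.
    partner = {}
    for _, i, j in candidates:
        if i not in partner and j not in partner:
            partner[i], partner[j] = j, i

    merged = []
    for i, com in enumerate(communities):
        j = partner.get(i)
        if j is None:
            merged.append(set(com))
        elif i < j:                   # the absorbed community j is skipped
            merged.append(set(com) | set(communities[j]))
    return merged


# Toy example: the path 0-1-2-3 with every node in its own community.
path_edges = [(0, 1), (1, 2), (2, 3)]
after_merge = merge_stage_q(path_edges, [{0}, {1}, {2}, {3}])
```

In the toy example the pairs (0, 1) and (2, 3) have the largest gain, so the middle pair (1, 2) is no longer available when it is reached.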

Algorithm 3 Fine-tuned_Algorithm(G, C)

 1: comSize = |C|;
 2: splitSize = 0;
 3: mergeSize = 0;
 4: while comSize != splitSize || comSize != mergeSize do
 5:   comSize = |C|;
 6:   C = Split_Communities(G, C);
 7:   splitSize = |C|;
 8:   C = Merge_Communities(G, C);
 9:   mergeSize = |C|;
10: end while
11: return C


The fine-tuned algorithm repeatedly carries out those
two alternating stages until neither split nor merging can
improve the value of the quality metric or until the total
number of communities discovered does not change
after one full iteration. Algorithm 3 shows the outline
of the fine-tuned algorithm. It can detect the community
structure of a network by taking a list with a single
community of all the nodes in the network as the input.
It can also improve the community detection results of
other algorithms by taking a list with their communities
as the input. Let the number of iterations of the fine-tuned
algorithm be denoted as T. Then, the total complexity
for Fine-tuned Q is O(T(|E| + |V| log |V| + |C|^2 log |C|)),
while for Fine-tuned Qds it is
O(T(|E| + |V| log |V| + |C||V| + |C|^3)). Assuming that T
and |C| are constants, the complexity of the fine-tuned
algorithms reduces to O(|E| + |V| log |V|). The only part of
the algorithm that would generate a non-deterministic
result is the Lanczos method of calculating the Fiedler
vector. The reason is that the Lanczos method adopts a
randomly generated vector as its starting vector. We solve
this issue by choosing a normalized vector of the size
equal to the number of nodes in the community as the
starting vector for the Lanczos method. Then, community
detection results will stay the same for different runs as
long as the input remains the same.
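
The control flow of Algorithm 3 can be sketched as below. The names `fine_tune`, `split_stage`, and `merge_stage` are hypothetical; the two stages are passed in as callables so that the loop stays independent of the quality metric, and the toy stages at the end exist only to exercise the stopping rule.

```python
def fine_tune(graph, communities, split_stage, merge_stage):
    """Alternate split and merge stages until the community count is stable.

    split_stage and merge_stage are callables taking (graph, communities)
    and returning a new list of communities.
    """
    com_size = len(communities)
    split_size = merge_size = 0
    rounds = 0
    while com_size != split_size or com_size != merge_size:
        com_size = len(communities)
        communities = split_stage(graph, communities)
        split_size = len(communities)
        communities = merge_stage(graph, communities)
        merge_size = len(communities)
        rounds += 1
    return communities, rounds


# Toy stages: split any 4-node community in half once; never merge.
def toy_split(graph, communities):
    result = []
    for com in communities:
        if len(com) == 4:
            nodes = sorted(com)
            result.extend([set(nodes[:2]), set(nodes[2:])])
        else:
            result.append(set(com))
    return result


def toy_merge(graph, communities):
    return [set(com) for com in communities]


final_communities, rounds_used = fine_tune(None, [{0, 1, 2, 3}],
                                           toy_split, toy_merge)
```

The loop always runs at least once because the split and merge sizes start at zero, and it needs one extra round after the last change to observe that the number of communities no longer moves.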

4 Experimental Results 

In this section, we first introduce several popular
measurements for evaluating the quality of the results of
community detection algorithms. Denoting the greedy
algorithm of modularity maximization proposed by
Newman [7] as Greedy Q, we then use the above-mentioned
metrics to compare Greedy Q, Fine-tuned Q, and
Fine-tuned Qds. The comparison uses four real networks,
the classical clique network, and the LFR benchmark
networks, each instance of which is defined with
parameters selected from a wide range of possible
values. The results indicate that Fine-tuned Qds is the
most effective method among the three, followed by
Fine-tuned Q. Moreover, we show that Fine-tuned Qds can
be applied to significantly improve the detection results
of other algorithms.

In Subsection 2.2.2, we have shown that the modularity
maximization approach using the eigenvectors of
the Laplacian matrix is equivalent to the one using the
eigenvectors of the modularity matrix. This implies that
the split stage of our Fine-tuned Q is actually equivalent
to the spectral methods. Therefore, Fine-tuned Q with one
additional merge operation at each iteration unquestionably
has better performance than the spectral algorithms.
Hence, we do not discuss them here.

4.1 Evaluation Metrics 

The quality evaluation metrics we consider here can be
divided into three categories: Variation of Information (VI)
[45] and Normalized Mutual Information (NMI) [46] based
on information theory; F-measure [47] and Normalized Van
Dongen metric (NVD) [48] based on cluster matching;
Rand Index (RI) [49], Adjusted Rand Index (ARI) [50], and
Jaccard Index (JI) [51] based on pair counting.


4.1.1 Information Theory Based Metrics
Given partitions C and C', Variation of Information (VI)
[45] quantifies the "distance" between those two partitions,
while Normalized Mutual Information (NMI) [46]
measures the similarity between partitions C and C'. VI
is defined as

$$VI(C, C') = H(C) + H(C') - 2I(C, C') = H(C, C') - I(C, C'), \tag{40}$$

where H(.) is the entropy function and I(C, C') =
H(C) + H(C') - H(C, C') is the Mutual Information. Then,
NMI is given by

$$NMI(C, C') = \frac{2I(C, C')}{H(C) + H(C')}. \tag{41}$$

Using the definitions of H(.) and I(., .) in terms of
community sizes, we can express VI and NMI as a function
of counts only as follows:

$$VI(C, C') = -\frac{1}{|V|} \sum_{c_i \in C,\, c'_j \in C'} |c_i \cap c'_j| \log \frac{|c_i \cap c'_j|^2}{|c_i|\,|c'_j|}, \tag{44}$$

$$NMI(C, C') = \frac{-2 \sum_{c_i \in C,\, c'_j \in C'} \frac{|c_i \cap c'_j|}{|V|} \log \frac{|c_i \cap c'_j|\,|V|}{|c_i|\,|c'_j|}}{\sum_{c_i \in C} \frac{|c_i|}{|V|} \log \frac{|c_i|}{|V|} + \sum_{c'_j \in C'} \frac{|c'_j|}{|V|} \log \frac{|c'_j|}{|V|}}, \tag{45}$$

where |c_i| is the number of nodes in community c_i of C
and |c_i ∩ c'_j| is the number of nodes both in community
c_i of C and in community c'_j of C'.

4.1.2 Clustering Matching Based Metrics
Measurements based on clustering matching aim at finding
the largest overlaps between pairs of communities
of two partitions C and C'. F-measure [47] measures
the similarity between two partitions, while Normalized
Van Dongen metric (NVD) [48] quantifies the "distance"
between partitions C and C'. F-measure is defined as

$$F\text{-}measure(C, C') = \frac{1}{|V|} \sum_{c_i \in C} |c_i| \max_{c'_j \in C'} \frac{2|c_i \cap c'_j|}{|c_i| + |c'_j|}. \tag{46}$$

NVD is given by

$$NVD(C, C') = 1 - \frac{1}{2|V|} \left( \sum_{c_i \in C} \max_{c'_j \in C'} |c_i \cap c'_j| + \sum_{c'_j \in C'} \max_{c_i \in C} |c_i \cap c'_j| \right). \tag{47}$$
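
The count-based forms of VI, NMI, F-measure, and NVD can be evaluated directly from two label lists, as in the sketch below. The function name `partition_metrics` and the list-of-labels input are assumptions, and F-measure and NVD are written in their commonly used forms. Base-2 logarithms are assumed because they reproduce the values reported for Greedy Q on the clique network in Table 6, which the example at the end rebuilds (thirty cliques of five nodes as ground truth, fourteen two-clique communities plus two single cliques as the detected partition).

```python
from collections import Counter
from math import log2


def partition_metrics(truth, found):
    """truth, found: equal-length lists with one community label per node."""
    n = len(truth)
    size_t = Counter(truth)                 # |c_i|
    size_f = Counter(found)                 # |c'_j|
    overlap = Counter(zip(truth, found))    # |c_i ∩ c'_j| for non-empty pairs

    vi = -sum(n_ij * log2(n_ij * n_ij / (size_t[i] * size_f[j]))
              for (i, j), n_ij in overlap.items()) / n
    mutual = sum(n_ij / n * log2(n_ij * n / (size_t[i] * size_f[j]))
                 for (i, j), n_ij in overlap.items())
    h_t = -sum(s / n * log2(s / n) for s in size_t.values())
    h_f = -sum(s / n * log2(s / n) for s in size_f.values())
    # Convention assumed here: two single-community partitions give NMI = 1.
    nmi = 2 * mutual / (h_t + h_f) if h_t + h_f > 0 else 1.0

    best_f = {}            # best F-score of each ground truth community
    best_t = Counter()     # largest overlap of each ground truth community
    best_o = Counter()     # largest overlap of each detected community
    for (i, j), n_ij in overlap.items():
        f_ij = 2 * n_ij / (size_t[i] + size_f[j])
        best_f[i] = max(best_f.get(i, 0.0), f_ij)
        best_t[i] = max(best_t[i], n_ij)
        best_o[j] = max(best_o[j], n_ij)
    f_measure = sum(size_t[i] * best_f[i] for i in size_t) / n
    nvd = 1 - (sum(best_t.values()) + sum(best_o.values())) / (2 * n)
    return {"VI": vi, "NMI": nmi, "F-measure": f_measure, "NVD": nvd}


# Clique network example: clique index k = node // 5, cliques 0..27 paired.
clique_truth = [v // 5 for v in range(150)]
clique_found = [k // 2 if k < 28 else k - 14 for k in clique_truth]
clique_scores = partition_metrics(clique_truth, clique_found)
```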
4.1.3 Pair Counting Based Metrics
Metrics based on pair counting count the number of
pairs of nodes that are classified (in the same community
or in different communities) in two partitions C and C'.
Let a_11 indicate the number of pairs of nodes that are
in the same community in both partitions, a_10 denote
the number of pairs of nodes that are in the same
community in partition C but in different communities
in C', a_01 be the number of pairs of nodes which are in
different communities in C but in the same community
in C', and a_00 be the number of pairs of nodes which are in
different communities in both partitions. By definition,
A = a_11 + a_10 + a_01 + a_00 = |V|(|V| - 1)/2 is the total
number of pairs of nodes in the network. Then, Rand Index
(RI) [49], which is the ratio of the number of node pairs
placed in the same way in both partitions to the total
number of pairs, is given by

$$RI(C, C') = \frac{a_{11} + a_{00}}{A}. \tag{48}$$

Denote M = (1/A)(a_11 + a_10)(a_11 + a_01). Then, RI's
corresponding adjusted version, Adjusted Rand Index (ARI)
[50], is expressed as

$$ARI(C, C') = \frac{a_{11} - M}{\frac{1}{2}\left[(a_{11} + a_{10}) + (a_{11} + a_{01})\right] - M}. \tag{49}$$
TABLE 1

Metric values of the community structures discovered by Greedy Q, Fine-tuned Q, and Fine-tuned Qds on Zachary’s
karate club network (red italic font denotes the best value for each metric).

Algorithm       Q       Qds     VI      NMI     F-measure  NVD     RI      ARI     JI
Greedy Q        0.3807  0.1809  0.7677  0.6925  0.828      0.1471  0.8414  0.6803  0.6833
Fine-tuned Q    0.4198  0.2302  0.9078  0.6873  0.807      0.1618  0.7736  0.5414  0.5348
Fine-tuned Qds  0.4174  0.231   0.8729  0.6956  0.8275     0.1471  0.7861  0.5669  0.5604



(c) Communities detected with Fine-tuned Q. (d) Communities detected with Fine-tuned Qds.

Fig. 2. The community structures of the ground truth communities and those detected by Greedy Q, Fine-tuned Q, 
and Fine-tuned Qds on Zachary’s karate club network. 


The Jaccard Index (JI) [51], which is the ratio of the
number of node pairs placed in the same community
in both partitions to the number of node pairs that are
placed in the same group in at least one partition, is
defined as

$$JI(C, C') = \frac{a_{11}}{a_{11} + a_{10} + a_{01}}. \tag{50}$$

Each of these three metrics quantifies the similarity
between two partitions C and C'.
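
The three pair counting metrics can be computed without enumerating node pairs, since the number of co-clustered pairs follows from community and overlap sizes. The sketch below is illustrative only; the function name `pair_counting_metrics` and the label-list input are assumptions, and degenerate inputs (for example two all-singleton partitions, where denominators vanish) are not handled. The example reuses the clique network partitions of Table 6.

```python
from collections import Counter


def count_pairs(k):
    """Number of unordered pairs among k items."""
    return k * (k - 1) // 2


def pair_counting_metrics(truth, found):
    """truth, found: equal-length lists with one community label per node."""
    total = count_pairs(len(truth))                                # A
    same_truth = sum(count_pairs(s) for s in Counter(truth).values())
    same_found = sum(count_pairs(s) for s in Counter(found).values())
    a11 = sum(count_pairs(s) for s in Counter(zip(truth, found)).values())
    a10 = same_truth - a11      # together in truth, apart in found
    a01 = same_found - a11      # apart in truth, together in found
    a00 = total - a11 - a10 - a01

    ri = (a11 + a00) / total
    expected = (a11 + a10) * (a11 + a01) / total                   # M
    ari = (a11 - expected) / (0.5 * ((a11 + a10) + (a11 + a01)) - expected)
    ji = a11 / (a11 + a10 + a01)
    return {"RI": ri, "ARI": ari, "JI": ji}


# Clique network example: 30 ground truth cliques of 5 nodes versus
# 14 two-clique communities plus 2 single cliques.
pc_truth = [v // 5 for v in range(150)]
pc_found = [k // 2 if k < 28 else k - 14 for k in pc_truth]
pc_scores = pair_counting_metrics(pc_truth, pc_found)
```

For this example a_11 = 300, a_10 = 0, a_01 = 350, and A = 11175, which gives the RI, ARI, and JI values listed for Greedy Q in Table 6.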

4.2 Real Networks 

In this subsection, we first evaluate the performance
of Greedy Q, Fine-tuned Q, and Fine-tuned Qds on two
small networks (Zachary's karate club network [52] and
American college football network [53]) with ground
truth communities, and then on two large networks
(PGP network [54] and AS level Internet) without
ground truth communities.

4.2.1 Zachary’s Karate Club Network
We first compare the performance of Greedy Q, Fine-tuned
Q, and Fine-tuned Qds on Zachary's karate club
network [52]. It represents the friendships between 34
members of a karate club at a US university over a
period of 2 years. During the observation period, the
club split into two clubs as a result of a conflict within
the organization. The resulting two new clubs can be
treated as the ground truth communities, whose structure
is shown in Figure 2(a), visualized with the open-source
software Gephi [55].

Table 1 presents the metric values of the community
structures detected by the three algorithms on this network.
It shows that Fine-tuned Q and Fine-tuned Qds
achieve the highest value of Q and Qds, respectively.
However, most of the seven metrics based on ground
truth communities imply that Greedy Q performs the best,
with only NMI and NVD indicating that Fine-tuned Qds
has the best performance among the three algorithms.
Hence, it seems that a large Q or Qds may not necessarily
mean a high quality of community structure, especially
for Q, because Fine-tuned Q achieves the highest Q but
has the worst values of the seven metrics described in
Subsection 4.1. We argue that the ground truth communities
may not be so reasonable because Fine-tuned
Q and Fine-tuned Qds in fact discover more meaningful
communities than Greedy Q does. Figures 2(a)-2(d) show
TABLE 2

Metric values of the community structures detected by Greedy Q, Fine-tuned Q, and Fine-tuned Qds on American
college football network (red italic font denotes the best value for each metric).

Algorithm       Q       Qds     VI      NMI     F-measure  NVD      RI      ARI     JI
Greedy Q        0.5773  0.3225  1.4797  0.7624  0.6759     0.2304   0.9005  0.5364  0.4142
Fine-tuned Q    0.5944  0.3986  0.9615  0.8553  0.8067     0.1348   0.9521  0.7279  0.6045
Fine-tuned Qds  0.6005  0.4909  0.5367  0.9242  0.9145     0.07391  0.9847  0.8967  0.8264


TABLE 3

Metric values of the community structures of Greedy Q and Fine-tuned Q improved with Fine-tuned Qds on
American college football network (blue italic font indicates improved score).

Algorithm                                  Q       Qds     VI      NMI     F-measure  NVD      RI      ARI     JI
Greedy Q improved with Fine-tuned Qds      0.5839  0.4636  0.6986  0.9013  0.8961     0.0913   0.9793  0.8597  0.7714
Fine-tuned Q improved with Fine-tuned Qds  0.5974  0.4793  0.5096  0.9278  0.9166     0.06957  0.9837  0.8907  0.8174


the community structure of ground truth communities
and those detected by Greedy Q, Fine-tuned Q, and Fine-tuned
Qds, respectively. For the results of Greedy Q shown
in Figure 2(b), we can observe that there are three
communities located at the left, the center, and the
right side of the network. The ground truth community
located on the right is subdivided into the central and
right communities, but node 10 is misclassified as belonging
to the central community, while in the ground truth
network it belongs to the community located on the left.
Figure 2(c) demonstrates that Fine-tuned Q subdivides
both the left and the right communities into two, with
six nodes separated from the left community and five
nodes separated from the right community. Moreover,
Figure 2(c) shows that Fine-tuned Q discovers the same
number of communities for this network as the algorithms
presented in [9], [16], [20], [22]. In fact, the community
structure it discovers is identical to those detected in
[16], [20], [22]. Figure 2(d) shows that the community
structure discovered by Fine-tuned Qds differs from that
of Fine-tuned Q only on node 24, which is placed in the
larger part of the left community. This is reasonable because
it has three connections to the larger part, to which it has
more attraction than to the smaller part, with which it
has only two connections.

In addition, analyzing the intermediate results of Fine-tuned
Q and Fine-tuned Qds reveals that the communities
at the first iteration are exactly the ground truth communities,
which in another way implies their superiority
over Greedy Q. Moreover, NMI and NVD indicate that
Fine-tuned Qds is the best among the three, and all the
metrics, except Q, show that Fine-tuned Qds performs better
than Fine-tuned Q, supporting the claim that a higher
Qds (but not Q) implies a better quality of community
structure.

4.2.2 American College Football Network
We also apply the three algorithms to the American
college football network [53], which represents the schedule
of games between college football teams in a single
season. The teams are divided into twelve "conferences",
with intra-conference games being more frequent
than inter-conference games. Those conferences could be


TABLE 4

The values of Q and Qds of the community structures
detected by Greedy Q, Fine-tuned Q, and Fine-tuned
Qds on PGP network (red italic font denotes the best
value for each metric).

Algorithm       Q       Qds
Greedy Q        0.8521  0.04492
Fine-tuned Q    0.8405  0.02206
Fine-tuned Qds  0.594   0.287


treated as the ground truth communities, whose structure
is shown in Figure 3(a).

Table 2 presents the metric values of the community
structures detected by the three algorithms. It shows
that Fine-tuned Qds achieves the best values for all the
nine metrics. It implies that Fine-tuned Qds performs
best on this football network, followed by Fine-tuned
Q. Figures 3(a)-3(d) present the community structure
of ground truth communities and those discovered by
Greedy Q, Fine-tuned Q, and Fine-tuned Qds. Each color
in the figures represents a community. It can be seen
that there are twelve ground truth communities in total,
seven communities detected by Greedy Q, nine communities
discovered by Fine-tuned Q, and exactly twelve
communities found by Fine-tuned Qds.

Moreover, we apply Fine-tuned Qds to the community
detection results of Greedy Q and Fine-tuned Q. The
metric values of these two community structures after
improvement with Fine-tuned Qds are shown in Table 3.
Compared with those of Greedy Q and Fine-tuned Q in
Table 2, we can observe that the metric values are
significantly improved with Fine-tuned Qds. Further, both
improved community structures contain exactly twelve
communities, the same number as the ground truth
communities.

4.2.3 PGP Network 

We then apply the three algorithms to the PGP network [54].
It is the giant component of the network of users of the
Pretty-Good-Privacy algorithm for secure information
interchange. It has 10680 nodes and 24316 edges.

Table 4 presents the metric values of the community 
structures detected by the three algorithms. Since this 
(c) Communities detected with Fine-tuned Q. (d) Communities detected with Fine-tuned Qds.

Fig. 3. The community structures of the ground truth communities and those detected by Greedy Q, Fine-tuned Q,
and Fine-tuned Qds on American college football network.


network does not have ground truth communities, we
only calculate Q and Qds of these discovered community
structures. The table shows that Greedy Q and Fine-tuned
Qds achieve the highest value of Q and Qds, respectively.
It is worth mentioning that the Qds of Fine-tuned Qds
is much larger than that of Greedy Q and Fine-tuned Q,
which implies that Fine-tuned Qds performs best on the PGP
network according to Qds, followed by Greedy Q.


4.2.4 AS Level Internet

The last real network dataset that is adopted to evaluate
the three algorithms is AS level Internet. It is a symmetrized
snapshot of the structure of the Internet at the
level of autonomous systems, reconstructed from BGP
tables posted by the University of Oregon Route Views
Project. This snapshot was created by Mark Newman
from data for July 22, 2006 and has not been previously
TABLE 6

Metric values of the community structures detected by Greedy Q, Fine-tuned Q, and Fine-tuned Qds on the classical
clique network (red italic font denotes the best value for each metric).

Algorithm       Q       Qds     VI      NMI     F-measure  NVD     RI      ARI     JI
Greedy Q        0.8871  0.46    0.9333  0.8949  0.6889     0.2333  0.9687  0.6175  0.4615
Fine-tuned Q    0.8871  0.46    0.9333  0.8949  0.6889     0.2333  0.9687  0.6175  0.4615
Fine-tuned Qds  0.8758  0.8721  0       1       1          0       1       1       1


TABLE 7

Metric values of the community structures of Greedy Q and Fine-tuned Q improved with Fine-tuned Qds on the
classical clique network (blue italic font indicates improved score).

Algorithm                                  Q       Qds     VI  NMI  F-measure  NVD  RI  ARI  JI
Greedy Q improved with Fine-tuned Qds      0.8758  0.8721  0   1    1          0    1   1    1
Fine-tuned Q improved with Fine-tuned Qds  0.8758  0.8721  0   1    1          0    1   1    1


TABLE 5

The values of Q and Qds of the community structures
detected by Greedy Q, Fine-tuned Q, and Fine-tuned
Qds on AS level Internet (red italic font denotes the best
value for each metric).

Algorithm       Q       Qds
Greedy Q        0.6379  0.002946
Fine-tuned Q    0.6475  0.003123
Fine-tuned Qds  0.3437  0.03857


published. It has 22963 nodes and 48436 edges.

Table 5 presents the metric values of the community
structures detected by the three algorithms. Since this
network does not have ground truth communities either,
we only calculate Q and Qds. It can be seen from the
table that Fine-tuned Q and Fine-tuned Qds achieve the
highest value of Q and Qds, respectively. Moreover, the
Qds of Fine-tuned Qds is much larger than that of Greedy
Q and Fine-tuned Q, which indicates that Fine-tuned Qds
performs best on AS level Internet according to Qds,
followed by Fine-tuned Q.

4.3 Synthetic Networks 

4.3.1 Clique Network

We now apply the three algorithms to the classical
network example [23]-[25], displayed in Figure 4, which
illustrates that modularity (Q) has the resolution limit
problem. It is a ring network comprised of thirty identical
cliques, each of which has five nodes, and they are
connected by single edges. It is intuitively obvious that
each clique forms a single community.

Table 6 presents the metric values of the community
structures detected by the three algorithms. It shows that
Greedy Q and Fine-tuned Q have the same performance.
They both achieve the highest value of Q but get about
half of the value of Qds that Fine-tuned Qds achieves.
In fact, Fine-tuned Qds finds exactly thirty communities,
with each clique being a single community. In contrast,
Greedy Q and Fine-tuned Q discover only sixteen communities,
with fourteen communities having two cliques
and the other two communities having a single clique.
Also, we take the community detection results of Greedy



Fig. 4. A ring network made out of thirty identical cliques, 
each having five nodes and connected by single edges. 

Q and Fine-tuned Q as the input to Fine-tuned Qds to
try to improve those results. The metric values of the
community structures after improvement with Fine-tuned
Qds are recorded in Table 7. This table shows that the
community structures discovered are identical to that of
Fine-tuned Qds, which means that the results of Greedy Q
and Fine-tuned Q are dramatically improved with Fine-tuned
Qds. Therefore, it can be concluded from Tables 6
and 7 that a larger value of Qds (but not Q) implies a
higher quality of the community structure. Moreover,
Qds solves the resolution limit problem of Q. Finally,
Fine-tuned Qds is effective in maximizing Qds and in
finding meaningful community structure.
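
The Q and Qds values in Table 6 can be checked on the ring of cliques with the sketch below. It is an illustration under stated assumptions: the function names are hypothetical, the graph is unweighted, the two single-clique communities are placed next to each other (their position on the ring is not stated here), and `modularity_density` uses an assumed form of the Modularity Density definition given earlier in the paper (internal density d_c = 2|E_c^in| / (|c|(|c| - 1)) and a Split Penalty weighted by the pair density |E_{c,c'}| / (|c||c'|)).

```python
from itertools import combinations


def ring_of_cliques(num_cliques=30, clique_size=5):
    """Edge list of identical cliques joined into a ring by single edges."""
    edges = []
    for k in range(num_cliques):
        base = k * clique_size
        edges.extend(combinations(range(base, base + clique_size), 2))
        nxt = ((k + 1) % num_cliques) * clique_size
        edges.append((base + clique_size - 1, nxt))
    return edges


def _stats(edges, membership):
    sizes, internal, between = {}, {}, {}
    for c in membership:
        sizes[c] = sizes.get(c, 0) + 1
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
        else:
            key = (min(cu, cv), max(cu, cv))
            between[key] = between.get(key, 0) + 1
    external = {c: 0 for c in sizes}
    for (c1, c2), w in between.items():
        external[c1] += w
        external[c2] += w
    return sizes, internal, between, external


def modularity(edges, membership):
    m = len(edges)
    sizes, internal, _, external = _stats(edges, membership)
    return sum(internal.get(c, 0) / m
               - ((2 * internal.get(c, 0) + external[c]) / (2 * m)) ** 2
               for c in sizes)


def modularity_density(edges, membership):
    m = len(edges)
    sizes, internal, between, external = _stats(edges, membership)
    total = 0.0
    for c, n in sizes.items():
        e_in = internal.get(c, 0)
        d_c = 2 * e_in / (n * (n - 1)) if n > 1 else 0.0
        total += (e_in / m * d_c
                  - ((2 * e_in + external[c]) / (2 * m) * d_c) ** 2)
    # Split Penalty: every connected pair is charged once per community.
    for (c1, c2), w in between.items():
        d_pair = w / (sizes[c1] * sizes[c2])
        total -= 2 * (w / (2 * m)) * d_pair
    return total


ring_edges = ring_of_cliques()
by_clique = [v // 5 for v in range(150)]             # thirty communities
by_pair = [(v // 5) // 2 if v // 5 < 28 else v // 5 - 14
           for v in range(150)]                      # sixteen communities
q_clique, q_pair = modularity(ring_edges, by_clique), modularity(ring_edges, by_pair)
qds_clique = modularity_density(ring_edges, by_clique)
qds_pair = modularity_density(ring_edges, by_pair)
```

Under these assumptions the sixteen-community partition has the larger Q while the thirty-clique partition has the larger Qds, which is the resolution limit behavior discussed above.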

4.3.2 LFR Benchmark Networks 

To further compare the performance of Greedy Q, Fine-tuned
Q, and Fine-tuned Qds, we choose the LFR benchmark
networks [56], which have become a standard in the
evaluation of the performance of community detection
algorithms and also have known ground truth communities.
The LFR benchmark network that we used here
has 1000 nodes with average degree 15 and maximum
degree 50. The exponent γ for the degree sequence
varies from 2 to 3. The exponent β for the community
size distribution ranges from 1 to 2. Then, four pairs
of the exponents (γ, β) = (2,1), (2,2), (3,1), and (3,2)
are chosen in order to explore the widest spectrum
of graph structures. The mixing parameter μ is varied
TABLE 8

Metric values of the community structures of Greedy Q on the LFR benchmark networks with (γ, β) = (2,1).

μ     Q       Qds      VI      NMI     F-measure  NVD     RI      ARI     JI
0.05  0.9021  0.4481   0.1403  0.9767  0.9382     0.0399  0.9959  0.9308  0.8758
0.1   0.8461  0.3546   0.5213  0.9482  0.8539     0.0912  0.9882  0.821   0.7089
0.15  0.7862  0.2604   0.8537  0.9125  0.7604     0.1432  0.9776  0.7042  0.5573
0.2   0.7256  0.1934   1.3601  0.8579  0.6314     0.2173  0.9601  0.5445  0.3911
0.25  0.6612  0.1411   1.7713  0.8093  0.5477     0.2642  0.9444  0.4498  0.309
0.3   0.5959  0.09377  2.1758  0.7493  0.4745     0.3085  0.921   0.3779  0.255
0.35  0.545   0.07237  2.4599  0.7122  0.4182     0.3347  0.9045  0.3206  0.2134
0.4   0.4857  0.05521  2.7444  0.672   0.3745     0.3623  0.8874  0.2766  0.1836
0.45  0.4356  0.04133  3.0108  0.6289  0.327      0.3875  0.8617  0.2288  0.153
0.5   0.3803  0.03016  3.4296  0.5685  0.2874     0.4159  0.8386  0.1885  0.1282


TABLE 9

Metric values of the community structures of Fine-tuned Q on the LFR benchmark networks with (γ, β) = (2,1).

μ     Q       Qds      VI      NMI     F-measure  NVD     RI      ARI     JI
0.05  0.8411  0.3875   0.8674  0.8868  0.8137     0.1049  0.9404  0.7503  0.6673
0.1   0.8419  0.3837   0.5195  0.9481  0.8851     0.0695  0.9875  0.8333  0.7408
0.15  0.7886  0.3324   0.6453  0.9358  0.8664     0.0844  0.9858  0.801   0.6921
0.2   0.7221  0.2922   0.9615  0.9022  0.8056     0.1222  0.9725  0.7099  0.6061
0.25  0.6694  0.2502   1.11    0.8833  0.7831     0.137   0.9594  0.7045  0.5939
0.3   0.626   0.2022   1.0722  0.892   0.813      0.1265  0.9811  0.7317  0.5963
0.35  0.5479  0.1516   1.6786  0.8153  0.705      0.1942  0.949   0.5963  0.4629
0.4   0.5044  0.124    1.8382  0.8108  0.6935     0.2111  0.9646  0.5592  0.4118
0.45  0.4274  0.07865  2.5657  0.7274  0.5913     0.2863  0.9463  0.4419  0.3129
0.5   0.3766  0.05808  3.0333  0.675   0.5328     0.3375  0.9366  0.3721  0.2537

TABLE 10

Metric values of the community structures of Fine-tuned Qds on the LFR benchmark networks with (γ, β) = (2,1).

μ     Q       Qds     VI      NMI     F-measure  NVD      RI      ARI     JI
0.05  0.845   0.4257  0.8112  0.9186  0.8564     0.09585  0.9736  0.691   0.5717
0.1   0.7934  0.4144  0.5809  0.9447  0.9326     0.0625   0.9915  0.8566  0.7646
0.15  0.7426  0.3605  0.6769  0.9359  0.9172     0.0711   0.9902  0.8303  0.7225
0.2   0.6786  0.337   0.7824  0.9278  0.9195     0.0795   0.9908  0.8186  0.7037
0.25  0.6202  0.2891  1.0244  0.9046  0.8909     0.106    0.9868  0.7575  0.6253
0.3   0.5693  0.235   1.1347  0.8919  0.8874     0.1183   0.9845  0.7372  0.5983
0.35  0.5443  0.2244  0.9401  0.9123  0.9129     0.09585  0.989   0.7984  0.6783
0.4   0.505   0.1964  0.9444  0.9123  0.9091     0.0966   0.989   0.7929  0.668
0.45  0.4536  0.1632  1.1523  0.8925  0.8806     0.1196   0.9834  0.7337  0.6021
0.5   0.3563  0.1196  1.9677  0.8036  0.7489     0.2076   0.9213  0.4984  0.3813


from 0.05 to 0.5. It means that each node shares a
fraction (1 - μ) of its edges with the other nodes in its
community and shares a fraction μ of its edges with
the nodes outside its community. Thus, low mixing
parameters indicate strong community structure. Also,
we generate 10 network instances for each μ. Hence,
each metric value in Tables 8-12 represents the average
metric value over all 10 instances. Since the experimental
results are similar for all four pairs of exponents (γ, β) =
(2,1), (2,2), (3,1), and (3,2), for the sake of brevity, we
only present the results for (γ, β) = (2,1) here.
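
The meaning of the mixing parameter can be checked on any graph with known communities by measuring, for each node, the fraction of its edges that leave its community. The sketch below is illustrative only; the function name `node_mixing` and the edge-list and label-dictionary inputs are assumptions, and it measures the empirical fraction rather than generating an LFR network.

```python
def node_mixing(edges, membership):
    """Return {node: fraction of its edges that leave its community}."""
    degree = {}
    external = {}
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            degree[a] = degree.get(a, 0) + 1
            if membership[a] != membership[b]:
                external[a] = external.get(a, 0) + 1
    return {v: external.get(v, 0) / degree[v] for v in degree}


# Toy example: two triangles joined by the single edge (2, 3).
mix_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
mix_labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
mixing = node_mixing(mix_edges, mix_labels)
average_mixing = sum(mixing.values()) / len(mixing)
```

In the toy example only nodes 2 and 3 have an external edge, each with one of three edges leaving its community, so the average fraction is small and the community structure is strong.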

Tables 8-10 show the metric values of the community
structures detected with Greedy Q, Fine-tuned Q,
and Fine-tuned Qds, respectively, on the LFR benchmark
networks with (γ, β) = (2,1) and μ varying from 0.05
to 0.5. The red italic font in the tables denotes that the
corresponding algorithm achieves the best value for a
certain quality metric among the three algorithms. The
results in these tables show that Greedy Q obtains the
best values for all the nine measurements when μ = 0.05,
while Fine-tuned Qds achieves the highest values of Qds
and the best values for almost all the seven metrics based
on ground truth communities when μ ranges from 0.1
to 0.5. Also, Fine-tuned Q gets the second best values for
Qds and almost all the seven metrics in the same range
of μ. However, for Q the best is Greedy Q, followed by
Fine-tuned Q, and Fine-tuned Qds is the last.

In summary, the seven measurements based on
ground truth communities are all consistent with Qds,
but not consistent with Q. This consistency indicates
the superiority of Qds over Q as a community quality
metric. In addition, Fine-tuned Qds performs best among
the three algorithms for μ > 0.05, which demonstrates
that it is very effective in optimizing Qds.

We then take the community detection results of
Greedy Q and Fine-tuned Q as the input to Fine-tuned Qds
to improve those results. The measurement values of the
community structures after improvement with Fine-tuned
Qds are displayed in Tables 11 and 12. The blue italic font
in Table 11 and Table 12 implies that the metric value
in these two tables is improved compared to the one in

TABLE 11

Metric values of the community structures of Greedy Q improved with Fine-tuned Qds on the LFR benchmark
networks with (γ, β) = (2,1).

μ     Q       Qds     VI      NMI     F-measure  NVD      RI      ARI     JI
0.05  0.8743  0.4979  0.2131  0.98    0.9784     0.02195  0.997   0.943   0.895
0.1   0.8246  0.4522  0.2428  0.9773  0.9762     0.02395  0.9967  0.9379  0.8864
0.15  0.7716  0.4013  0.2972  0.9722  0.9719     0.0289   0.9962  0.9269  0.8674
0.2   0.7232  0.384   0.3503  0.9679  0.9664     0.03505  0.9959  0.9163  0.8496
0.25  0.6667  0.3347  0.4474  0.9592  0.9582     0.04485  0.9953  0.9011  0.8243
0.3   0.6094  0.2619  0.6061  0.9432  0.9457     0.05905  0.9934  0.876   0.7856
0.35  0.5584  0.2377  0.691   0.9364  0.94       0.0697   0.9931  0.8615  0.7626
0.4   0.5062  0.199   0.8285  0.9236  0.9247     0.0823   0.9916  0.8376  0.7281
0.45  0.4587  0.169   0.9016  0.9172  0.9222     0.0904   0.9914  0.8252  0.7099
0.5   0.4014  0.1385  1.2004  0.8906  0.8938     0.1215   0.9885  0.7686  0.6326

TABLE 12

Metric values of the community structures of Fine-tuned Q improved with Fine-tuned Qds on the LFR benchmark
networks with (γ, β) = (2,1).

μ     Q       Qds     VI      NMI     F-measure  NVD      RI      ARI     JI
0.05  0.8519  0.4463  0.5949  0.937   0.8954     0.0709   0.9781  0.8177  0.7377
0.1   0.8186  0.4397  0.3405  0.9679  0.9615     0.03415  0.9952  0.9125  0.8452
0.15  0.769   0.391   0.4285  0.9597  0.9533     0.0432   0.9946  0.8993  0.8231
0.2   0.7185  0.369   0.4654  0.9571  0.9479     0.04975  0.9943  0.8853  0.8014
0.25  0.6672  0.326   0.5667  0.9477  0.9365     0.05805  0.9936  0.8713  0.7785
0.3   0.6109  0.2598  0.6962  0.9346  0.9372     0.06505  0.9926  0.8609  0.762
0.35  0.5474  0.2297  0.9525  0.9108  0.9175     0.0961   0.9882  0.7963  0.6821
0.4   0.4966  0.1983  1.0601  0.9021  0.9118     0.1029   0.9896  0.7963  0.672
0.45  0.4284  0.1535  1.4754  0.8635  0.8694     0.1486   0.9831  0.6836  0.5362
0.5   0.3654  0.1258  1.9271  0.8192  0.8193     0.1987   0.968   0.5852  0.4423

Table 8 and that in Table 9, respectively. Then, compared
with those of Greedy Q shown in Table 8 and those of
Fine-tuned Q shown in Table 9, all measurements, except
in some cases for Q, are significantly improved with Fine-tuned
Qds. This again indicates that all the seven metrics
described in Subsection 4.1 are consistent with Qds, but
not consistent with Q. Interestingly, those results are
even better than those of Fine-tuned Qds itself presented
in Table 10. Thus, it can be concluded that Fine-tuned Qds
is very powerful in improving the community detection
results of other algorithms.

5 Conclusion 

In this paper, we review the definition of modularity
and its corresponding maximization methods. Moreover,
we show that modularity optimization has two opposite
but coexisting issues. We also review several community
quality metrics proposed to solve the resolution limit
problem. We then discuss our Modularity Density (Qds)
metric, which simultaneously avoids those two problems.
Finally, we propose an efficient and effective fine-tuned
algorithm to maximize Qds. This new algorithm can
actually be used to optimize any community quality
metric. We evaluate the three algorithms, Greedy Q, Fine-tuned
Q based on Q, and Fine-tuned Qds based on Qds,
with seven metrics based on ground truth communities.
These evaluations are done on four real networks, and
also on the classical clique network and the LFR benchmark
networks, each instance of the last being defined with
parameters selected from a wide range of their values.
The results demonstrate that Fine-tuned Qds performs
best among the three algorithms, followed by Fine-tuned
Q. The experiments also show that Fine-tuned Qds can
dramatically improve the community detection results of
other algorithms. In addition, all the seven quality metrics
based on ground truth communities are consistent
with Qds, but not consistent with Q, which indicates the
superiority of Qds over Q as a community quality metric.